Background of the Invention

This application is a continuation-in-part of U.S. Patent
Application of Alizon et al. for "Cloned DNA Sequences
Related to the Entire Genomic RNA of Human
Immunodeficiency Virus II (HIV-2), Polypeptides Encoded by these
DNA Sequences and Use of these DNA Clones and Polypeptides in
Diagnostic Kits," filed January 16, 1987, which is a
continuation-in-part of U.S. Patent Application Serial No.
931,866 filed November 21, 1986, which is a continuation-in-part
application of U.S. Patent Application Serial No. 916,080 of
Montagnier et al. for "Cloned DNA Sequences Related to the
Genomic RNA of the Human Immunodeficiency Virus II (HIV-2),
Polypeptides Encoded by these DNA Sequences and Use of these DNA
Clones and Polypeptides in Diagnostic Kits," filed October 6,
1986 and U.S. Patent Application Serial No. 835,228 of Montagnier
et al. for "New Retrovirus Capable of Causing AIDS, Antigens
Obtained from this Retrovirus and Corresponding Antibodies and
their Application for Diagnostic Purposes," filed March 3, 1986.
The disclosures of each of these predecessor applications are
expressly incorporated herein by reference.

The invention relates to cloned DNA sequences analogous to 
the genomic RNA of a virus known as Lymphadenopathy-Associated
Virus II ("LAV-II"), a process for the preparation of these
cloned DNA sequences, and their use as probes in diagnostic kits. 
In one embodiment, the invention relates to a cloned DNA sequence 
analogous to the entire genomic RNA of HIV-2 and its use as a
probe. The invention also relates to polypeptides with amino
acid sequences encoded by these cloned DNA sequences and the use 
of these polypeptides in diagnostic kits. 

According to recently adopted nomenclature, as reported in 
Nature, May 1986, a substantially identical group of retroviruses
which has been identified as one causative agent of AIDS is now
referred to as Human Immunodeficiency Virus I (HIV-1). This
previously described group of retroviruses includes
Lymphadenopathy-Associated Virus I (LAV-I), Human T-cell
Lymphotropic Virus-III (HTLV-III), and AIDS-Related Virus (ARV).

Lymphadenopathy-Associated Virus II has been described in 
United States Application Serial No. 835,228, which was filed 
March 3, 1986, and is specifically incorporated herein by
reference. Because LAV-II is a second, distinct causative agent of
AIDS, LAV-II properly is classifiable as a Human Immunodeficiency 
Virus II (HIV-2). Therefore, "LAV-II" as used hereinafter 
describes a particular genus of HIV-2 isolates. 

While HIV-2 is related to HIV-1 by its morphology, its 
tropism and its in vitro cytopathic effect on CD4 (T4) positive
cell lines and lymphocytes, HIV-2 differs from previously 
described human retroviruses known to be responsible for AIDS. 
Moreover, the proteins of HIV-1 and 2 have different sizes and 
their serological cross-reactivity is restricted mostly to the 
major core protein, as the envelope glycoproteins of HIV-2 are 
not immune precipitated by HIV-1-positive sera except in some
cases where very faint cross-reactivity can be detected. Since a
significant proportion of the HIV infected patients lack
antibodies to the major core protein of their infecting virus, it 
is important to include antigens to both HIV-1 and HIV-2 in an 
effective serum test for the diagnosis of the infection by these 
viruses. 

HIV-2 was first discovered in the course of serological re- 
search on patients native to Guinea-Bissau who exhibited clinical 
and immunological symptoms of AIDS and from whom sero-negative or 
weakly sero-positive reactions to tests using an HIV-1 lysate 
were obtained. Further clinical studies on these patients iso- 
lated viruses which were subsequently named "LAV-II." 

One LAV-II isolate, subsequently referred to as LAV-II MIR, 
was deposited at the Collection Nationale des Cultures de Micro- 
Organismes (CNCM) at the Institut Pasteur in Paris, France on 
December 19, 1985 under Accession No. I-502 and has also been
deposited at the British ECACC under No. 87.001.001 on January
9, 1987. A second LAV-II isolate was deposited at CNCM on
February 21, 1986 under Accession No. I-532 and has also been
deposited at the British ECACC under No. 87.001.002 on January
9, 1987. This second isolate has been subsequently referred to
as LAV-II ROD. Other isolates deposited at the CNCM on December
19, 1986 are HIV-2 IRMO (No. I-642) and HIV-2 EHO (No. I-643).
Several additional isolates have been obtained from West African 
patients, some of whom have AIDS, others with AIDS-related condi- 
tions and others with no AIDS symptoms. All of these viruses 
have been isolated on normal human lymphocyte cultures and some
of them were thereafter propagated on lymphoid tumor cell lines
such as CEM and MOLT. 

Due to the sero-negative or weak sero-positive results
obtained when using kits designed to identify HIV-1 infections in 
the diagnosis of these new patients with HIV-2 disease, it has 
been necessary to devise a new diagnostic kit capable of 
detecting HIV-2 infection, either by itself or in combination 
with an HIV-1 infection. The present inventors have, through the 
development of cloned DNA sequences analogous to at least a
portion of the genomic RNA of LAV-II ROD viruses, created the
materials necessary for the development of such kits.

Summary of the Invention 

As noted previously, the present invention relates to the 
cloned nucleotide sequences homologous or identical to at least a 
portion of the genomic RNA of HIV-2 viruses and to polypeptides 
encoded by the same. The present invention also relates to kits 
capable of diagnosing an HIV-2 infection. 

Thus, a main object of the present invention is to provide a 
kit capable of diagnosing an infection caused by the HIV-2 virus. 
This kit may operate by detecting at least a portion of the RNA 
genome of the HIV-2 virus or the provirus present in the infected 
cells through hybridization with a DNA probe or it may operate 
through the immunodiagnostic detection of polypeptides unique to
the HIV-2 virus. 

Additional objects and advantages of the present invention 
will be set forth in part in the description which follows, or
may be learned from practice of the invention. The objects and
advantages may be realized and attained by means of the instru- 
mentalities and combinations particularly pointed out in the 
appended claims. 

To achieve these objects and in accordance with the purposes 
of the present invention, cloned DNA sequences related to the 
entire genomic RNA of the LAV-II virus are set forth. These se- 
quences are analogous specifically to the entire genome of the 
LAV-II ROD strain. 

To further achieve the objects and in accordance with the 
purposes of the present invention, a kit capable of diagnosing an 
HIV-2 infection is described. This kit, in one embodiment, con- 
tains the cloned DNA sequences of this invention which are capa- 
ble of hybridizing to viral RNA or analogous DNA sequences to in- 
dicate the presence of an HIV-2 infection. Different diagnostic 
techniques can be used which include, but are not limited to: 
(1) Southern blot procedures to identify viral DNA which may or 
may not be digested with restriction enzymes; (2) Northern blot 
techniques to identify viral RNA extracted from cells; and 
(3) dot blot techniques, i.e., direct filtration of the sample 
through an ad hoc membrane such as nitrocellulose or nylon with- 
out previous separation on agarose gel. Suitable material for 
dot blot technique could be obtained from body fluids including, 
but not limited to, serum and plasma, supernatants from culture 
cells, or cytoplasmic extracts obtained after cell lysis and
removal of membranes and nuclei of the cells by
ultra-centrifugation as accomplished in the "CYTODOT" procedure
as described in a booklet published by Schleicher and Schuell.

In an alternate embodiment, the kit contains the polypeptides
created using these cloned DNA sequences. These polypeptides
are capable of reacting with antibodies to the HIV-2
virus present in sera of infected individuals, thus yielding an
immunodiagnostic complex.

To further achieve the objects of the invention, a 
vaccinating agent is provided which comprises at least one 
peptide selected from the polypeptide expression products of the 
viral DNA in admixture with suitable carriers, adjuvants, or
stabilizers.

It is understood that both the foregoing general description 
and the following detailed description are exemplary and explana- 
tory only and are not restrictive of the invention as claimed. 
The accompanying drawings, which are incorporated in and consti- 
tute a part of the specification, illustrate one embodiment of 
the invention and, together with the description, serve to
explain the principles of the invention. 

Brief Description of the Drawings 

Figure 1 generally depicts the nucleotide sequence of a 
cloned complementary DNA (cDNA) to the genomic RNA of HIV-2. 
Figure 1A depicts the genetic organization of HIV-1, position of
the HIV-1 HindIII fragment used as a probe to screen the cDNA
library, and restriction map of the HIV-2 cDNA clone, E2.
Figure 1B depicts the nucleotide sequence of the 3' end of HIV-2.
The corresponding region of the HIV-1 LTR was aligned using the
Wilbur and Lipman algorithm (window: 10; K-tuple: 7; gap penalty:
3) as described by Wilbur and Lipman in Proc. Natl. Acad. Sci.
USA 80: 726-730 (1983), specifically incorporated herein by
reference. The U3-R junction in HIV-1 is indicated and the poly A
addition signal and potential TATA promoter regions are boxed.
In Figure 1B, the symbols B, H, Ps and Pv refer to the
restriction sites BamHI, HindIII, PstI and PvuII, respectively.

Figure 2 generally depicts the HIV-2 specificity of the E2 
clone. Figure 2A and B specifically depict a Southern blot of 
DNA extracted from CEM cells infected with the following
isolates: HIV-2 ROD (a,c), HIV-2 DUL (b,d), and HIV-1 BRU (e,f).
DNA in lanes a,b,f was PstI digested; in c,d,e DNA was
undigested. Figure 2C and D specifically depict dot blot
hybridization of pelleted virions from CEM cells infected by
HIV-1 BRU (1), Simian Immunodeficiency Virus (SIV) isolate Mm
142-83 (3), HIV-2 DUL (4), HIV-2 ROD (5), and HIV-1 ELI (6).
Dot 2 is a pellet from an equivalent volume of supernatant from
uninfected CEM. Thus, Figure 2A and C depict hybridization with
the HIV-2 cDNA (E2) and Figure 2B and D depict hybridization to
an HIV-1 probe consisting of a 9 Kb SacI insert from HIV-1 BRU
(clone lambda J19).

Figure 3 generally depicts a restriction map of the HIV-2 
ROD genome and its homology to HIV-1. Figure 3A specifically 
depicts the organization of three recombinant phage lambda 
clones, ROD 4, ROD 27, and ROD 35. In Figure 3A, the open boxes
represent viral sequences, the LTR are filled, and the dotted
boxes represent cellular flanking sequences (not mapped). Only
some characteristic restriction enzyme sites are indicated.
λROD 27 and λROD 35 are derived from integrated proviruses
while λROD 4 is derived from a circular viral DNA. The portion
of the lambda clones that hybridizes to the cDNA E2 is indicated
below the maps. A restriction map of the ROD isolate was
reconstructed from these three lambda clones. In this map, the
restriction sites are identified as follows: B: BamHI; E: EcoRI;
H: HindIII; K: KpnI; Ps: PstI; Pv: PvuII; S: SacI; X: XbaI.
R and L are the right and left BamHI arms of the lambda L47.1
vector.

Figure 3B specifically depicts dots 1-11 which correspond to
the single-stranded DNA form of M13 subclones from the HIV-1 BRU
cloned genome (λJ19). Their size and position on the HIV-1
genome, determined by sequencing, are shown below the figure.
Dot 12 is a control containing lambda phage DNA. The dot-blot
was hybridized in low stringency conditions as described in
Example 1 with the complete λROD 4 clone as a probe, and
successively washed in 2x SSC, 0.1% SDS at 25°C. (Tm -42°C), 1x
SSC, 0.1% SDS at 60°C. (Tm -20°C), and 0.1x SSC, 0.1% SDS at
60°C. (Tm -3°C) and exposed overnight. A duplicate dot blot was
hybridized and washed in stringent conditions (as described in
Example 2) with the labelled lambda J19 clone carrying the
complete HIV-1 BRU genome. HIV-1 and HIV-2 probes were labelled to
the same specific activity (10^ cpm/ug.).

Figure 4 generally depicts the restriction map polymorphism
in different HIV-2 isolates and shows comparison of HIV-2 to SIV.
Figure 4A specifically depicts DNA (20 ug. per lane) from CEM
cells infected by the isolate HIV-2 DUL (panel 1) or peripheral
blood lymphocytes (PBL) infected by the isolates HIV-2 GOM
(panel 2) and HIV-2 MIR (panel 3) digested with: EcoRI (a), PstI
(b), and HindIII (c). Much less viral DNA was obtained with
HIV-2 isolates propagated on PBL. Hybridization and washing were
in stringent conditions, as described in Example 2, with 10^6
cpm/ml. of each of the E2 insert (cDNA) and the 5 kb. HindIII
fragment of λROD 4, labelled to 10^9 cpm/ug.

Figure 4B specifically depicts DNA from HUT 78 (a human T 
lymphoid cell line) cells infected with STLV-3 MAC isolate Mm
142-83. The same amounts of DNA and enzymes were used as
indicated in panel A. Hybridization was performed with the same
probe as in A, but in non-stringent conditions, as described in
Example 1. Washing was for one hour in 2x SSC, 0.1% SDS at 40°C
(panel 1) and after exposure, the same filter was re-washed in
0.1x SSC, 0.1% SDS at 60°C. (panel 2). The autoradiographs were
obtained after overnight exposure with intensifying screens.

Figure 5 depicts the position of derived plasmids
from λROD 27, λROD 35 and λROD 4.

Detailed Description of the Preferred Embodiments 

Reference will now be made in detail to the presently pre- 
ferred embodiments of the invention, which, together with the 
following examples, serve to explain the principles of the 
invention.

The genetic structure of the HIV-2 virus has been analyzed
by molecular cloning according to the method set forth herein and 
in the Examples. A restriction map of the genome of this virus 
is included in Figure 4. In addition, the partial sequence of a 
cDNA complementary to the genomic RNA of the virus has been 
determined. This cDNA sequence information is included in 
Figure 1. 

Also contained herein is data describing the molecular 
cloning of the complete 9.5 kb genome of HIV-2, data describing 
the observation of restriction map polymorphism between different 
isolates, and an analysis of the relationship between HIV-2 and 
other human and simian retroviruses. From the totality of these 
data, diagnostic probes can be discerned and prepared. 

Generally, to practice one embodiment of the present inven- 
tion, a series of filter hybridizations of the HIV-2 RNA genome 
with probes derived from the complete cloned HIV-1 genome and 
from the gag and pol genes were conducted. These hybridizations 
yielded only extremely weak signals even in conditions of very 
low stringency of hybridization and washing. Thus, it was found to
be difficult to assess the amount of HIV-2 viral and proviral DNA 
in infected cells by Southern blot techniques. 

Therefore, a complementary DNA (cDNA) to the HIV-2 genomic 
RNA initially was cloned in order to provide a specific
hybridization probe. To construct this cDNA, an oligo (dT) primed
cDNA first-strand was made in a detergent-activated endogenous
reaction using HIV-2 reverse transcriptase with virions purified
from supernatants of infected CEM cells. The CEM cell line is a
lymphoblastoid CD4+ cell line described by G.E. Foley et al. in
Cancer 18: 522-529 (1965), specifically incorporated herein by
reference. The CEM cells used were infected with the isolate ROD 
and were continuously producing high amounts of HIV-2. 

After second-strand synthesis, the cDNAs were inserted into 
the M13 tg 130 bacteriophage vector. A collection of 10^4 M13
recombinant phages was obtained and screened in situ with an
HIV-1 probe spanning 1.5 kb. of the 3' end of the LAV BRU isolate
(depicted in Figure 1A). Some 50 positive plaques were detected,
purified, and characterized by end sequencing and cross- 
hybridizing the inserts. This procedure is described in more 
detail in Example 1 and in Figure 1. 

The different clones were found to be complementary to the 
3' end of a polyadenylated RNA having the AATAAA signal about 20 
nucleotides upstream of the poly A tail, as found in the long 
terminal repeat (LTR) of HIV-1. The LTR region of HIV-1 has been 
described by S. Wain-Hobson et al. in Cell 40: 9-17 (1985),
specifically incorporated herein by reference. The portion of the
HIV-2 LTR that was sequenced was related only distantly to the
homologous domain in HIV-1 as demonstrated in Figure 1B.
Indeed, only about 50% of the nucleotides could be aligned and
about a hundred insertions/deletions needed to be introduced. In
comparison, the homology of the corresponding domains in HIV-1 
isolates from USA and Africa is greater than 95% and no inser- 
tions or deletions are seen. 
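
The two quantities quoted in this comparison, the fraction of aligned nucleotides and the number of insertions/deletions, can be tallied from any gapped pairwise alignment. The following Python sketch is illustrative only. The function name alignment_stats and the two short sequences are hypothetical, the code does not implement the Wilbur and Lipman algorithm, and it assumes that identity is taken over all alignment columns and that each uninterrupted run of gap characters in one sequence counts as a single insertion/deletion.

```python
def alignment_stats(a: str, b: str, gap: str = "-") -> dict:
    """Summarize a gapped pairwise alignment given as two equal-length strings."""
    if len(a) != len(b):
        raise ValueError("aligned sequences must have the same length")

    matches = mismatches = gap_columns = indels = 0
    prev = None  # which sequence carried a gap in the previous column
    for x, y in zip(a.upper(), b.upper()):
        if x == gap and y == gap:
            continue  # column empty in both sequences carries no information
        if x == gap:
            state = "a"
        elif y == gap:
            state = "b"
        else:
            state = None

        if state is None:
            if x == y:
                matches += 1
            else:
                mismatches += 1
        else:
            gap_columns += 1
            if state != prev:
                indels += 1  # a new run of gaps starts here
        prev = state

    columns = matches + mismatches + gap_columns
    identity = matches / columns if columns else 0.0
    return {
        "matches": matches,
        "mismatches": mismatches,
        "gap_columns": gap_columns,
        "indels": indels,
        "identity": identity,
    }


if __name__ == "__main__":
    # Hypothetical toy alignment, not taken from Figure 1B.
    seq_1 = "AATAAAGC-TTGCCT"
    seq_2 = "AATAAAGCATT--CT"
    print(alignment_stats(seq_1, seq_2))
```

Counting gap runs rather than gap columns is a design choice: a single three-nucleotide deletion is then reported as one event, which is closer to how insertions/deletions are usually counted by eye on a printed alignment.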



The largest insert of this group of M13 clones was a 2 kb.
clone designated E2. Clone E2 was used as a probe to demonstrate 
its HIV-2 specificity in a series of filter hybridization experi- 
ments. Firstly, this probe could detect the genomic RNA of HIV-2 
but not HIV-1 in stringent conditions as shown in Figure 2, C and 
D. Secondly, positive signals were detected in Southern blots of 
DNA from cells infected with the ROD isolate as well as other 
isolates of HIV-2 as shown in Figure 2, A and Figure 4, A. No
signal was detected with DNA from uninfected cells or HIV-1 in- 
fected cells, confirming the exogenous nature of HIV-2. In 
undigested DNA from HIV-2 infected cells, an approximately 10 kb. 
species, probably corresponding to linear unintegrated viral DNA, 
was principally detected along with a species with an apparent 
size of 6 kb., likely to be the circular form of the viral DNA. 
Conversely, rehybridization of the same filter with an HIV-1
probe under stringent conditions showed hybridization to HIV-1 
infected cells only as depicted in Figure 2, B. 

To isolate the remainder of the genome of HIV-2, a genomic
library in lambda phage L47.1 was constructed. Lambda phage 
L47.1 has been described by W.A.M. Loenen et al. in Gene 10: 
249-259 (1980), specifically incorporated herein by reference. 
The genomic library was constructed with a partial Sau3AI
restriction digest of the DNA from the CEM cell line infected with
HIV-2 ROD.

About 2 x 10^6 recombinant plaques were screened in situ with
labelled insert from the E2 cDNA clone. Ten recombinant phages
were detected and plaque purified. Of these phages, three were
characterized by restriction mapping and Southern blot
hybridization with the E2 insert and probes from its 3' end (LTR)
or 5' end (envelope), as well as with HIV-1 subgenomic probes.
In this instance, HIV-1 probes were used under non-stringent
conditions.

A clone carrying a 9.5 kb. insert and derived from a circu- 
lar viral DNA was identified as containing the complete genome 
and designated λROD 4. Two other clones, λROD 27 and λROD 35,
were derived from integrated proviruses and found to carry an LTR 
and cellular flanking sequences and a portion of the viral coding 
sequences as shown in Figure 3, A. 

Fragments of the lambda clones were subcloned into a plasmid
vector pUC18.

Plasmid pROD 27-5' is derived from λROD 27 and contains the
5' 2 Kb of the HIV-2 genome and cellular flanking sequences (5'
LTR and 5' viral coding sequences to the EcoRI site).

Plasmid pROD 4-8 is derived from λROD 4 and contains the
about 5 Kb HindIII fragment that is the central part of the HIV-2
genome.

Plasmid pROD 27-5' and pROD 4.8 inserts overlap.

Plasmid pROD 4.7 contains a HindIII 1.8 Kb fragment from
λROD 4. This fragment is located 3' to the fragment subcloned
into pROD 4.8 and contains about 0.8 Kb of viral coding sequences
and the part of the lambda phage (λL47.1) left arm located
between the BamHI and HindIII cloning sites.

Plasmid pROD 35 contains all the HIV-2 coding sequences 3'
to the EcoRI site, the 3' LTR and about 4 Kb of cellular flanking
sequences.

Plasmid pROD 27-5' and pROD 35 in E. coli strain HB 101 are
deposited respectively under No. I-626 and I-633 at the CNCM, and
have also been deposited at the NCIB (British Collection). These
plasmids are depicted in Figure 5. Plasmids pROD 4-7 and pROD
4-8 in E. coli strain TG1 are deposited respectively under
No. I-627 and I-628 at the CNCM.

To reconstitute the complete HIV-2 ROD genome, pROD 35 is
linearized with EcoRI and the EcoRI insert of pROD 27-5' is
ligated in the correct orientation into this site.

The relationship of HIV-2 to other human and simian
retroviruses was surmised from hybridization experiments. The
relative homology of the different regions of the HIV-1 and 2 genomes
was determined by hybridization of fragments of the cloned HIV-1 
genome with the labelled λROD 4 expected to contain the complete
HIV-2 genome (Figure 3, B). Even in very low stringency
conditions (Tm -42°C), the hybridization of HIV-1 and 2 was restricted
to a fraction of their genomes, principally the gag gene (dots 1 
and 2), the reverse transcriptase domain in pol (dot 3), the end 
of pol and the Q (or sor) genes (dot 5) and the F gene (or 3' 
orf) and 3' LTR (dot 11). The HIV-1 fragment used to detect the 
HIV-2 cDNA clones contained the dot 11 subclone, which hybridized 
well to HIV-2 under non-stringent conditions. Only the signal 
from dot 5 persisted after stringent washing. The envelope gene,
the region of the tat gene and a part of pol thus seemed very
divergent. These data, along with the LTR sequence obtained
(Figure 1, B), indicated that HIV-2 is not an envelope variant of
HIV-1, as are African isolates from Zaire described by Alizon et
al., Cell 46: 63-74 (1986).

It was observed that HIV-2 is related more closely to the 
Simian Immunodeficiency Virus (SIV) than it is to HIV-1. This
correlation has been described by F. Clavel et al. in C.R. Acad.
Sci. (Paris) 302: 485-488 (1986) and F. Clavel et al. in Science
233: 343-346 (1986), both of which are specifically incorporated
herein by reference. Simian Immunodeficiency Virus (also desig- 
nated Simian T-cell Lymphotropic Virus Type 3, STLV-3) is a ret- 
rovirus first isolated from captive macaques with an AIDS-like 
disease in the USA. This simian virus has been described by M.D. 
Daniel et al. in Science 228: 1201-1204 (1985), specifically in- 
corporated herein by reference. 

All the SIV proteins, including the envelope, are immune 
precipitated by sera from HIV-2 infected patients, whereas the 
serological cross-reactivity of HIV-1 to 2 is restricted to the 
core proteins. However, SIV and HIV-2 can be distinguished by
slight differences in the apparent molecular weight of their
proteins.

In terms of nucleotide sequence, it also appears that HIV-2 
is closely related to SIV. The genomic RNA of SIV can be 
detected in stringent conditions as shown in Figure 2, C by HIV-2
probes corresponding to the LTR and 3' end of the genome (E2) or
to the gag or pol genes. Under the same conditions, HIV-1
derived probes do not detect the SIV genome as shown in Figure 2,
D.

In Southern blots of DNA from SIV-infected cells, a
restriction pattern clearly different from HIV-2 ROD and other isolates
is seen. All the bands persist after a stringent washing, even 
though the signal is considerably weakened, indicating a sequence 
homology throughout the genomes of HIV-2 and SIV. It has re- 
cently been shown that baboons and macaques could be infected 
experimentally by HIV-2, thereby providing an interesting animal 
model for the study of the HIV infection and its preventive ther- 
apy. Indeed, attempts to infect non-human primates with HIV-1 
have been successful only in chimpanzees, which are not a conve- 
nient model. 

From an initial survey of the restriction maps for certain 
of the HIV-2 isolates obtained according to the methods described 
herein, it is already apparent that HIV-2, like HIV-1, undergoes 
restriction site polymorphism. Figure 4A depicts examples of
such differences for three isolates, all different one from
another and from the cloned HIV-2 ROD. It is very likely that
these differences at the nucleotide level are accompanied by 
variations in the amino-acid sequence of the viral proteins, as 
evidenced in the case of HIV-1 and described by M. Alizon et al. 
in Cell 46: 63-74 (1986), specifically incorporated herein by 
reference. It is also to be expected that the various isolates 
of HIV-2 will exhibit amino acid heterogeneities. See, for
example, Clavel et al., Nature 324 (18): 691-695 (1986),
specifically incorporated herein by reference.

Further, the characterization of HIV-2 will also delineate the
domain of the envelope glycoprotein that is responsible for the
binding to the surface of the target cells and the subsequent
internalization of the virus. This interaction was shown to be
mediated by the CD4 molecule itself in the case of HIV-1 and
similar studies tend to indicate that HIV-2 uses the same receptor.
Thus, although there is wide divergence between the env genes of 
HIV-1 and 2, small homologous domains of the envelopes of the two 
HIV could represent a candidate receptor binding site. This site 
could be used to raise a protective immune response against this 
group of retroviruses. 

From the data discussed herein, certain nucleotide sequences 
have been identified which are capable of being used as probes in 
diagnostic methods to obtain the immunological reagents necessary 
to diagnose an HIV-2 infection. In particular, these sequences 
may be used as probes in hybridization reactions with the genetic 
material of infected patients to indicate whether the RNA of the 
HIV-2 virus is present in these patients' lymphocytes or whether
an analogous DNA is present. In this embodiment, the test meth- 
ods which may be utilized include Northern blots, Southern blots 
and dot blots. One particular nucleotide sequence which may be 
useful as a probe is the combination of the 5 kb. HindIII
fragment of λROD 4 and the E2 cDNA used in Figure 4.

In addition, the genetic sequences of the HIV-2 virus may be
used to create the polypeptides encoded by these sequences. Spe- 
cifically, these polypeptides may be created by expression of the 
cDNA obtained according to the teachings herein in hosts such as 
bacteria, yeast or animal cells. These polypeptides may be used 
in diagnostic tests such as immunofluorescence assays (IFA), ra- 
dioimmunoassays (RIA) and Western Blot tests. 

Moreover, it is also contemplated that additional diagnostic 
tests, including additional immunodiagnostic tests, may be
developed in which the DNA probes or the polypeptides of this 
invention may serve as one of the diagnostic reagents. The 
invention described herein includes these additional test
methods.

In addition, monoclonal antibodies to these polypeptides or 
fragments thereof may be created. The monoclonal antibodies may 
be used in immunodiagnostic tests in an analogous manner as the
polypeptides described above. 

The polypeptides of the present invention may also be used 
as immunogenic reagents to induce protection against infection by 
HIV-2 viruses. In this embodiment, the polypeptides produced by 
recombinant-DNA techniques would function as vaccine agents.

Also, the polypeptides of this invention may be used in
competitive assays to test the ability of various antiviral agents
to prevent the virus from fixing on its target.

Thus, it is to be understood that application of the
teachings of the present invention to a specific problem or
environment will be within the capabilities of one having ordinary skill
in the art in light of the teachings contained herein. Examples 
of the products of the present invention and representative pro- 
cesses for their isolation and manufacture appear above and in 
the following examples.

EXAMPLES 

Example 1: Cloning of a cDNA Complementary to
Genomic RNA From HIV-2 Virions

HIV-2 virions were purified from 5 liters of supernatant 
from a culture of the CEM cell line infected with the ROD isolate 
and a cDNA first strand using oligo (dT) primer was synthesized
in a detergent-activated endogenous reaction on pelleted virus, as
described by M. Alizon et al. in Nature, 312: 757-760 (1984),
specifically incorporated herein by reference. RNA-cDNA hybrids 
were purified by phenol-chloroform extraction and ethanol precip- 
itation. The second-strand cDNA was created by the DNA 
polymerase I/RNase H method of Gubler and Hoffman in Gene, 25:
263-269 (1983), specifically incorporated herein by reference,
using a commercial cDNA synthesis kit obtained from Amersham.
After attachment of EcoRI linkers (obtained from Pharmacia),
EcoRI digestion, and ligation into EcoRI-digested
dephosphorylated M13 tg 130 vector (obtained from Amersham), a
cDNA library was obtained by transformation of the E. coli TG1
strain. Recombinant plaques (10^4) were screened in situ on
replica filters with the 1.5 kb. HindIII fragment from clone J19,
corresponding to the 3' part of the genome of the LAV BRU isolate
of HIV-1, 32P-labelled to a specific activity of 10^9 cpm/ug. The
filters were prehybridized in 5 x SSC, 5 x Denhardt solution, 25%
formamide, and denatured salmon sperm DNA (100 ug/ml.) at 37°C.
for 4 hours and hybridized for 16 hours in the same buffer (Tm
-42°C.) plus 4 x 10^7 cpm of the labelled probe (10^6 cpm/ml. of
hybridization buffer). The washing was done in 5 x SSC, 0.1% SDS
at 25°C. for 2 hours. 20 x SSC is 3M NaCl, 0.3M Na citrate.
Positive plaques were purified and single-stranded M13 DNA
prepared and end-sequenced according to the method described in
Proc. Nat'l. Acad. Sci. USA, 74: 5463-5467 (1977) of Sanger et
al.
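
The buffer arithmetic implied by this example can be checked with a short calculation. The Python sketch below is a convenience illustration and not part of the described procedure. It scales the stated 20 x SSC composition (3M NaCl, 0.3M Na citrate) to any working strength and derives the hybridization volume from the stated probe input and probe concentration. The function names are hypothetical, and the sodium ion figure assumes that the citrate is supplied as trisodium citrate, which the text does not state.

```python
# Composition of 20 x SSC as stated in the text, in mol/L.
SSC_20X = {"NaCl": 3.0, "Na citrate": 0.3}


def ssc_molarity(fold: float) -> dict:
    """Return the molar composition of an n x SSC solution."""
    return {name: conc * fold / 20.0 for name, conc in SSC_20X.items()}


def sodium_molarity(fold: float) -> float:
    """Total Na+ in n x SSC, assuming trisodium citrate (3 Na+ per citrate)."""
    comp = ssc_molarity(fold)
    return comp["NaCl"] + 3.0 * comp["Na citrate"]


def buffer_volume_ml(total_cpm: float, cpm_per_ml: float) -> float:
    """Volume of hybridization buffer implied by probe input and concentration."""
    return total_cpm / cpm_per_ml


if __name__ == "__main__":
    for fold in (5, 2, 1, 0.1):
        print(f"{fold} x SSC:", ssc_molarity(fold),
              "Na+ (assumed trisodium citrate):", round(sodium_molarity(fold), 4))
    # 4 x 10^7 cpm of probe at 10^6 cpm/ml of hybridization buffer
    print("hybridization volume (ml):", buffer_volume_ml(4e7, 1e6))
```

With the figures given in this example, 4 x 10^7 cpm of probe at 10^6 cpm/ml corresponds to 40 ml of hybridization buffer.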

Example 2: Hybridization of DNA from HIV-1 and
HIV-2 Infected Cells and RNA from HIV-1
and 2 and SIV Virions With a Probe
Derived From an HIV-2 Cloned cDNA

DNA was extracted from infected CEM cells continuously
producing HIV-1 or 2. The DNA (20 ug), digested with PstI or
undigested, was electrophoresed on a 0.8% agarose gel,
and Southern-transferred to nylon membrane. Virion dot-blots
were prepared in duplicate, as described by F. Clavel et al. in
Science 233: 343-346 (1986), specifically incorporated herein by
reference, by pelleting volumes of supernatant corresponding to
the same amount of reverse transcriptase activity. 

Prehybridization was done in 50% formamide, 5 x SSC, 5 x Denhardt
solution, and 100 ug/ml. denatured salmon sperm DNA for 4 hours
at 42°C. Hybridization was performed in the same buffer plus 10%
Dextran sulphate, and 10^6 cpm/ml. of the labelled E2 insert
(specific activity 10^9 cpm/ug.) for 16 hours at 42°C. Washing
was in 0.1 x SSC, 0.1% SDS for 2 x 30 min. After exposure for
16 hours with intensifying screens, the Southern blot was
dehybridized in 0.4 N NaOH, neutralized, and rehybridized in the
same conditions to the HIV-1 probe labelled to 10^9 cpm/ug.

Example 3: Cloning in Lambda Phage of the 
Complete Provirus DNA of HIV-2 

DNA from the HIV-2 ROD-infected CEM cells (Figure 2, lanes a and c) 
was partially digested with Sau3AI. The 9-15 kb. fraction was 
selected on a 5-40% sucrose gradient and ligated to BamHI arms of 
the lambda L47.1 vector. Plaques (2 x 10⁶) obtained after in 
vitro packaging and plating on E. coli LA 101 strain were 
screened in situ with the insert from the E2 cDNA clone. Approx- 
imately 10 positive clones were plaque purified and propagated on 
E. coli C600 recBC. The ROD 4, 27, and 35 clones were ampli- 
fied and their DNA characterized by restriction mapping and 
Southern blotting with the HIV-2 cDNA clone under stringent con- 
ditions, and gag-pol probes from HIV-1 used under nonstringent 
conditions. 



Example 4: Complete Genomic Sequence of 
the ROD HIV-2 Isolate 

Experimental analysis of the HIV-2 ROD isolate yielded the 
following sequence which represents the complete genome of this 
HIV-2 isolate. Genes and major expression products identified 
within the following sequence are indicated by nucleotides num- 
bered below: 

1) GAG gene (546-2111) expresses a protein product having 
a molecular weight of around 55 kd, which is cleaved into the 
following proteins: 

a) p16 (546-950) 

b) p26 (951-1640) 

c) p12 (1701-2111) 

2) polymerase (1829-4936) 

3) Q protein (4869-5513) 

4) R protein (5682-5996) 

5) X protein (5344-5679) 

6) Y protein (5682-5996) 

7) Env protein (6147-8720) 

8) F protein (8557-9324) 

9) TAT gene (5845-6140 and 8307-8400) is expressed by two 
exons separated by introns. 

10) ART protein (6071-6140 and 8307-8536) is similarly the 
expression product of two exons. 

11) LTR: R (1-173 and 9498-9671) 

12) U5 (174-299) 

13) U3 (8942-9497) 
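
The coordinates listed above lend themselves to simple consistency checks. The following Python sketch is an illustration only: it records a subset of the listed intervals (treated as 1-based and inclusive, which is an assumption about the numbering convention) and computes interval lengths and overlaps. The names `FEATURES`, `length_nt` and `overlap_nt` are hypothetical.

```python
# A subset of the coordinates from the list above, as (start, end) pairs.
# Intervals are assumed to be 1-based and inclusive.
FEATURES = {
    "gag": [(546, 2111)],
    "pol": [(1829, 4936)],
    "Q": [(4869, 5513)],
    "X": [(5344, 5679)],
    "R": [(5682, 5996)],
    "env": [(6147, 8720)],
    "F": [(8557, 9324)],
    "tat": [(5845, 6140), (8307, 8400)],  # two exons
    "art": [(6071, 6140), (8307, 8536)],  # two exons
}


def length_nt(segments):
    """Total number of nucleotides covered by a list of inclusive intervals."""
    return sum(end - start + 1 for start, end in segments)


def overlap_nt(segments_a, segments_b):
    """Number of nucleotide positions shared by two lists of inclusive intervals."""
    shared = 0
    for start_a, end_a in segments_a:
        for start_b, end_b in segments_b:
            shared += max(0, min(end_a, end_b) - max(start_a, start_b) + 1)
    return shared


if __name__ == "__main__":
    for name, segments in FEATURES.items():
        print(name, length_nt(segments))
    print("gag/pol overlap:", overlap_nt(FEATURES["gag"], FEATURES["pol"]))
```

With these intervals, the GAG region spans 1,566 nucleotides (a multiple of three), and the GAG and polymerase regions share the 283 nucleotides from 1829 to 2111.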

It will be known to one of skill in the art that the 
absolute numbering which has been adopted is not essential. For 
example, the nucleotide within the LTR which is designated as "1" 
is a somewhat arbitrary choice. What is important is the se- 
quence information provided. 
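
In the sequence listing, lines of nucleotides are printed beneath the three-letter codes of the amino acids they encode. The following Python sketch, offered only as an illustration, shows how such a translation line can be regenerated from a nucleotide line using the standard genetic code. The function name `translate_three_letter`, the `frame` parameter, and the use of `Xaa` for unreadable codons are assumptions introduced here.

```python
# Standard genetic code, built from the conventional TCAG ordering.
BASES = "TCAG"
AMINO_1 = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    b1 + b2 + b3: AMINO_1[16 * i + 4 * j + k]
    for i, b1 in enumerate(BASES)
    for j, b2 in enumerate(BASES)
    for k, b3 in enumerate(BASES)
}

# One-letter to three-letter codes; "***" marks a stop codon, as in the listing.
THREE_LETTER = {
    "A": "Ala", "R": "Arg", "N": "Asn", "D": "Asp", "C": "Cys",
    "Q": "Gln", "E": "Glu", "G": "Gly", "H": "His", "I": "Ile",
    "L": "Leu", "K": "Lys", "M": "Met", "F": "Phe", "P": "Pro",
    "S": "Ser", "T": "Thr", "W": "Trp", "Y": "Tyr", "V": "Val",
    "*": "***",
}


def translate_three_letter(dna, frame=0):
    """Translate a DNA string into concatenated three-letter amino acid codes.

    Codons containing characters other than A, C, G or T (for example,
    characters damaged in transcription) are reported as "Xaa".
    """
    dna = dna.upper().replace(" ", "")
    codes = []
    for pos in range(frame, len(dna) - 2, 3):
        amino = CODON_TABLE.get(dna[pos:pos + 3])
        codes.append(THREE_LETTER[amino] if amino else "Xaa")
    return "".join(codes)


if __name__ == "__main__":
    # Fragment from the last line of the envelope listing in Example 5.
    print(translate_three_letter("CAGGCAACAAAATATGGA"))
```

Applied to the fragment CAGGCAACAAAATATGGA, the sketch returns GlnAlaThrLysTyrGly.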
GGTCGCTCTGCGGAGAGGCTGGCAGATTGAGCCCTGGGAGCTTCTCTCCAGCACTAGCAG 
»••«•• 
GTAGAGCCTGGGTCXTCCCTGCTAGACTCTCACCAGCACTTGGCCGGTGCTGGGCAGACG 

100 

GCCCCACGCTTGCTTGCTTAAAAACCTCTTAATAAAGCTCCCACTTAGAAGCAAGfTAAG 

TGXGTGCTCCCATCTCtCCTAGTCGCCCCCTGCTCATTCGGTGITCACCTGAGTAACAAG 
. 200 .... 

ACCCTGGTCTGTTAGGACCCTTCTTGCTTTGGGAAACCGAGGCAGGAAAATCCCIAGCAG 
. . . . . 300 

GTTGGCGCCTGAACAGGGACTTGAAGAAGACIGACAAGTCTTGGAACACGGCTGAGTGAA 

GGCAGTAAGCCCGGCACGAACAAACCACGACGGAGTGCTCCTAGAAACGCGCGOCCCGAG 

. 400 • . 

GTACCAAAGGCAGCCTCTGGACCCGGAGGAGAAGAGGCCTCCCGGTGAACGTAAGTACCT 

I ACACCAAAAACTGTAGCCGAAACGCCTTGCTATCCTACCTTTAGACACGTAGAAGATTCT 
J 500 ... 

1 MetClyAl*ArgA*nSerV« lLeuArgG lyLy«ly»Al*A*pGluLeuCluArgIlt 

J GGGAGAXGGGCGCGAGAAACTCCCTCTTGAGAGGGAAAAAAGCAGATGAATTAGAAAGAA 

• • • * . 600 

ArgLeuArg?roGlyGlyLy«Ly*Ly»TyrArgL«uLysHitIieV«lTrpAUAUAtn 
TCAGGTTACGGCCCGGCCGAAAGAAAAAGTACAGGCTAAAACATATTGTGTGCGCAGCCA 

Ly»L«uAgpArgPhaGlyLauAlaGluS*rLtuLeuGluS«rLyaGluGlyCyaGlfiLyt 
AT A AATT GG AC A GATTCGC ATT AGCAGAGAGCCTGTTGGAGTC A AAAGAGGGTTGTC AAA 
« ♦ * 700 # « 

1 1 cLauThr Va lLauAipP roMe C Va IPr oThrG ly SarG LuAanL«uI*y«StrL«u?he 
AAATTCTTACAGTTTTAGATCCAATGGTACCGACAGGTTCAGAAAATTTAAAAAGTCTTT 

• *••♦♦ 
AitiThrV* LCy» ValIl«TrpCyfIItHi«AUGluGluLy»V* lLy»A»pThrGluGly 

TTAATACTGTCTGCG7CATTTGGTGCATACACGCAGAAGAGAAAGTGAAAGATAGTGAAG 
" "*O0 

AlaLysGlnIleV«lArgArgHi$LeuValAlaGluThrGlyThrAXaGluLy»Mcc?ro 
GAGCAAAACAAATAGTGCGCAGACATCTAGTGGCAGAAACAGGAACTGCAGAGAAAATGC 

900 

SerThr8«rArgProThrAl*ProSerS«rGluLytGlyGlyA«aTyrProV*lGlQHi* 
CAAGCACAAGTAGACCAACAGCACCATCTACCGAGAAGGGAGCAAATTACCCAGTGCAAC 
«•♦*•• 
ValGlyGlyAinTyrThrHia HeProL«uSerProArgThrLtuAflnAlaTrpVgiLy» 
ATGTAGGCGGCAACTACACCCATATACCGCTGACTCCCCGAACCCTAAATGCCTGGGTAA 

1000 

lauVa lGluCluLysLysPhtClyAlmGUValValProG lyPheG InAlaLauSerG lu 
AATTAGTAGAGCAAAAAAAGTTCGGGGCACAAGTAGTGCCACGATTTCACCCACTCTCAG 
CiyCyiThrProTyrAipILcA»ttClnM«tL«»A«aCyiV«lClyAipBi*ClaAUAl« 
AACCCTCCACCCCCIATCATATCAACCAAATGCTTAATTCTCTCCCCCACCATCAACCAO 

• 1100 • • • * 

KetClnIl#Il«ArgGluIU_Il«AflnClaCluAi4AiaGiuIrpAspVA iUlnLis: J ru 
CCAXGCAGAtAAXCAGGCACAXXAXCAATGACGAAGCAOCACAAXGGGATGXGCAACATC 

1200 

Il«ProGlyProZ,«uProAUGlyCUL«uArfCUProArgClyS«rA*plUAUCly 
CAATACCAGGCCCCTTACCAGCCGGGCAGCTTAGAGAGCCAAGGGGATCTCACATACCAC 

ThrThrS*rThrVdLGiuGluClnIltClnTrpMttPh«ArgProClnA»nProV*XPro 
GGACAACAAGCACAGTAGAAGAACAGATCCAGTGGATGXTTAGGCCACAAAATCCTGTAC 

• . . 1300 « 
VAlGlyA#nIleTyrArgArgTrpll«Gltt21tGlyUuGlnLy#Cy«V«lArgKttTyr 

CACXAGGAAACAICXAIAGAACATGGAXCCAGAXAGGAXXGCAGAAGXGXCXCAGGAXGX 

• *♦••• 
A*nProTiirA*nIl«L«uAf p IliLysGUC lyProLyiGluProPbtGinStrTyr V« 1 

ACAACCCGACCAACATCCTAGACA7AAAACAGGGACCAAAGGAGCCGTTCCAAAGCTA7G 
1400 • 

A«pArgPh«TyrLyi S«rL«uArgAUGluGlnThrA*pProAUV*lLy§A»txTrpM«t 
XAGAXAGAXXCXACAAAAGCIXGAGCGCAGAAGAAACACATCCAGCAGXGAAGAAXXGGA 

• ♦ ♦ • « 1S00 
ThrGlnTbrL«uL«uV4lGinAinAlAA»nProA«pCyiLy»L«uVtlL«uLy»ClyL#u 

TGACCCAAACACTGCTAGTACAAAATGCCAACCCACACTGTAAATTAGTCCTAAAAGGAC 

GlyM«tA*»ProXhrLeuG luG luMetttuXhrAUCyiGlnGly V*lGlyClyProCly 
TACCCAXGAACCCTACCXXAGAAGAGAXCCXGACCGCCXGXCAGGCCGXAGCXGGGCCAG 
- • . 1600 • « 

GlnLy«AUArgL«uMetAUGluAl*L«uLytGluV«lIl«GlyProAl*ProIl«Pro 
GCCAGAAAGCXAGAXXAATCGCAGAGCCCCXGAAAGAGGXCAXAGGACCIGCCCCXAXCC 
•**•«• 
Ph*Al*AUAUGUGUArgLy*AUPhtLy«Cy»TrpAinCy#GlyLy*GluGlyHi» 
CAXXCGCACCAGCCCAGCAGAGAAAGGCAXXXAAAXGCXGGAACXCXGCAAAGCAAOGGC 
1700 . 

SerAl«ArgGlnCyiArgAUProArgArgGlnGlyCyiXrpLy«Cy»GlyLy«ProGly 
ACXCGGCAAGACAAXGCCGAGCACCXAGAAGGCAGGGCXGCXGGAAGXGXGGXAAGCGAG 

* . « 1S001 

7hrGlyArgPh«PbtArgXhrGlyProL«uGly 
HigIl«M«tThrA»iiCy«ProA*pArgGlnAl*GiyPbeL«uGlyL«uGlyProXrpGly 
GACACAXCAXGACAAACTCCCCAGATAGACAGGCAGGXXXXXXAGGACTCGGCCCXXGGC 

LysGluAUProGlaL«uProArgGlyPro$«rS«rAUClyAi«A«pXhrAtn8€rXhr 
LytLyfProArgA«nPhtProValAUGXnVtlProGinGlyL«uThrProXUrAl*Pro 
GAAAGAAGCCCCGCAACXXCCCCGXGGCCCAAGXXCCCCAGGGGCXGACACCAACAGCAC 

1900 

ProS«rGly8«rS#rS«rGlyStrXhrGlyCluIliXyrAl*Al*ArgGluLyiXhrGlu 
Pro?tlA«pProAUV*lAipLeuL«uGluLy«XyrMetGlnGloGlyLy«ArgGlnArg 
CCCCAGXGGAXCCAGCAGXCCAXCXACXGGAGAAAXAXATCCAGCAACCGAAAAGACAGA 
♦*•••« 
ArgAl«GluArgGluThrIleClnClySerA«pArgGlyLtuXhrAl«ProArgAUGly 
GluGlBArgGluArgProXyrLy»GluV*lXhrGluAtpLeuL*uai§L«uGluClaGly 
GACAGCAGAGACACAGACCAXACAAGGAAGXCACACAGGACXXACXCCACCXCGAGCAGG 
2000 , 

GlyA«pXhrIi«GlnGlyAl«XhrAiaArgGlyL«uAlgAlaProGinPht««rL«uXrp 
GluXhrProXyrArgCLtiProProXhrGtuA*pL«uL€uHi«L«uA*nS€rttuPh«Gly 
CGCAGACACCAXACAGCGAGCCACCAACACAGGACXXCCXGCACCXCAAXXCXCXCXXXC 

♦ 2100 
LyaAr|~ProV*l?AlThrAlATyrIltGluCiyCiaProV«lGluV*lL«uL«uAipThr 
Ly«A«pCln 

caaaagaccagtactcacagcatacaitgacggtcagccagtagaagtcttcttagacac 

• ••••• 

GlyAl*A«pAfpS«rIleV4lAl*GlyIltGluLeu01yAsaA»oTyr8«rProLy«Iie 
AGGGGCTGACGACTCAATAGTAGCAGGAATAGAGTTAGGGAACAATTATAGCCCAAAAAT 

2200 

V* lClyClyIleGlyGlyPhelltA»oThrLy«G luTyrLy»AtnV«lGluIieCluV«l 
AGTAGCGGGAATAGGGGGATTCATAAATACCAAGGAATATAAAAATGTAGAAATAGAAGT 

TCTAAATAAAAAGCTACCGCCCACCATAATGACAGGCGACACCCCAATCAACATTTTTGG 

2300 .... 

ArgA«nI leL«uThrAl*ie\iG lyliet S«r L«uA*nL6uProVa 1A1 *Lys ValCluPro 

cagaaaxaxtcxgacagccxiaggcaxgxcatxaaaictaccagtcgccaaagiacagcc 

• • 2400 

Z l«Ly* I leMetL«uLy*ProGlyLy»A»pGlyProLy«L#uArgC lnTrpProL«uThr 

aataaaaataatgctaaaggcaggcaaacatggaccaaaactgagacaatggcccttaac 

• ••«•« 

Ly«GluLy« I leG luAlaL«uLy»CluIltCyiCluLy*M«tCluLytCluClyGlnLeu 
AAAAGAAAAAATAGAAGCACTAAAAGAAATCTGTGAAAAAATGGAAAAAOAAGGCCAGCT 

2500 

GluC luAl*ProProThrA»nProTyrA*nThrProThrPheAl«Il«LysLyfLy»A$p 
AGACGAAGCACC7CCAACTAATCCTTATAATACCCCCACATTTGCAATCAAGAAAAAGGA 

• • ♦ • • • 
LysAf nLysTrpArgMe t Leu I i «A§pPh« ArgC luL t uA»n Ly • V* lThrGlaA»pPhe 

CAAAAACAAATGGAGGATGCTAATACATTTCAGACAACTAAACAAGGTAACTCAAGATTT 

2600 • 

ThrGluIl«GlnLeuGlyI leProHi$ProAlaGlyLeuAl*LytLy«ArgAr t I l«Tbr 
CACAGAAATTCAGTTAGGAA7TCCACAGGGAGCAGGG7TGGCCAAGAAGAGAAGAATTAG 

• 2700 
V* 1L<suAi P V« LG ly At pAltTyrPht St r I leProLeaHiiGluAipPhtArgProTyr 

TCTACTAGATGTAGGGGATGCTTACTTTTCCATACCACTACAIGACGACTTTAGACCATA 

• * » * • • 
ThrAl APheTbr L«uProS«rV* lAioA#nAlAC luProG lyLyiArgTyr I l«Ty rLy 6, 

TACTGCATTTACTCTACCATCAGICAACAATGCAGAACCAGGAAAAACATACATATATAA 

2800 

V« lL«uProGlnGlyTrpLyiGlyS«rProAl«ZlePhcGlnBiiTbrMgtAfgGlnVa 1 
AGTCTTGCCACACCCATGGAAGGGATCACCAGCAATTTTTCAACACACAATGAGACACGT 

• • # • • • 
LtuGluProPh«ArgLy«Al«A§nLy»A*pV*lI 1«I 1«I 1«G loTyrM«t A»pA»p I l« 

ATTACAACCATTCAGAAAAGCAAACAAGGATCTCATTATCATTCAGTACATGGATGATAT 

• 2900 ♦ . 
LauIltAl tSerA«pArgXhrA*pLeuG lufiitAspArg Val V* lLtuG laLtuLyiG lu 

CXIAAXAGCXAGXGACAGGACAGAXXXAGAACAXGAXAGGGXAGXCCXGCAGCXCAAGCA 

* . • Jooo 

LtuLeuAsnG lyLeuG lyPh«S«rXhrProAf pG luLy«PhtG lnLy»A*pProProTy r 
ACXXCXAAAXGGCCXAGGAXXXXCTACCCCACATGACAAGXXCCAAAAAGACCCXCCAXA 

• • # # • • 
HisTrpM«tGlyTyrGluLeuTrpProThrly§TrpLy»L«uClnLyt IleGlnL«uPro 

CCACTGGATCCGCTATGAACTATGGCCAACTAAATGGAAGTTGCAGAAAATACAGTTGCC 

3100 

GlnLy»CluIl«TrpThrV«lA»nA«pIl«GlnLy«L«uV»iClyValL«uAioTrpAl« 
CCAAAAAGAAATATCGACACTCAATGACATCCACAAGCTAGTGCCTGTCCTAAATTGGGC 
Al4GlnL«uTyrProClyIi«I.y«ThrLytHiiL«uCyiArgl.«uIUArgClyLy»K«t 
AGCACAACTCTACCCAGGGATAAAGACCAAACACTTATGTAGGTTAATCAGACOAAAAAT 

3200 .... 
TbrL««ThrGiuGluV*lClBlrpThrGiuL«uAl«GluAl*GluL«uGluCluA«nArg 
CACACTCACAQAAGAAGTACAGTCCACAGAATTACCAGAACCAGAGCTACAACAAAACAC 

3300 

Il«Xl«L«o8«rGlnCluCUCluGlyHi»TyrTyrGlnGluClulyiCiuL«uCluAX« 
AATTATCCTAACCCAGGAACAAGAGGGACACTATTACCAAGAAGAAAAAGACCTACAACC 

• • * • « • 

TbrV«lGloLy*A«pGloGlaA«nGlnTrpThrTyrLy«Zl*Ri*GlaGluClttLytll< 
AACAGTCCAAAAGGATCAAGAGAATCAGTCGACATATAAAATACACCAGCAAGAAAAAAT 

3400 

L«uLy«ValGlyLy»TyrAl«Ly»V«lLytA*nTbrHiftbrAtoGlyll«ArgL«uL«u 
TCTAAAAGTAGCAAAATATGCAAAGGTCAAAAACACCCATACCAATGGAATCAGATTGTT 

AUGlnV* IVjlClaLy* UtClyLy»GiuAULeuV«Ill«TrpGlyArgIl«ProLyi 
AGCACACGTAGTTCAGAAAATAGGAAAACAAGCACTAGTCATTTGGGGACCAATACCAAA 
. 3500 .... 

Ph«Hi$L«uProV»lGiuArgCluIl«TrpGluGlnTrpTrpA»pAaqIyrTrpCXnV»l 
ATTTCACCTACCAGTAGAGAGAGAAATCTGGGAGCAGTGGIGGGATAACTAGTGGCAAGT 

3400 

ThrTrpil«ProAapTrpAipPh«V«U«rThrFroProLtuV«UpgJ»«uAUPbtA«ft 

GACAIGGATCCCAGACTGGGACTTCGTCTCTACCCCACCACIGCTCACGTIAOCGTTTAA 

LauV*lGlyA8pProIl«ProCiyAlaGluXhrPh«TyrTbrAipGlySarCy«A«nArji 
CCIGCTAGGCGATCCTATACCAGGTGCACAGACCTTCTACACAGATGGATCCTGCAATAG 

3700 . . 

ClnS«rLy«GluGlyLy»AUGlyTyrV« LThtAtpArgGlyLyiAipLy»V»lLy»Ly* 
GCAATCAAAACAAGGAAAAGCAGGATATGTAACAGATAGAGGGAAAGACAAGGTAAAGAA 

LeuGluGlnThrThrAtnG InG laA 1 aG lu L tuG luA 1 a? be A I *Het A 1 *L*uThr As p 
ACTAGAGCAAACTACCAATCAGCAAGCAGAACTAGAAGCCTTTGCGATGCCACTAACACA 

3800 ♦ 

SerGlyProLysValA«oIlelltVtlAtpS«rGlnTyrV«lM«tGlyZl«SeTAl4S«r 
CTCGGGTCCAAAAGTTAATATTATAGTAGACTCACAGTATGTAAIGGGGAICAGTCCAAG 

3900 

GlnProThrGluS«rGluSerLy«lltV«lAftnGlnIleIl«GluGluMeCli«LytLy9 
CCAACCAACAGAGTCAGAAACTAAAATAGTCAAC CAGAXCA TAGAAG AAATGATAAAAAA 

GluAlalleTyrValAUTrpVilProAUHliLyiGlylUGLyGlyAsaCUGluVAl 
GGAAGCAATCTATGTTGCATGGGTCCCAGCCC ACAAAGGC ATAGGGGGAAACCAGGAAGT 

4000 

A»pHiiLtuValS«rGUGlyI L«ArgClnVa lL#uPh«LtuGluLy«IltGluProAl* 
AGATCATTTAGTGAGTCAGGGTATCAGACAAGTGTTCTTC CTGGAAAAAATAGAGCCCGC 
• «•««• 
GlaGluGluHi»GluLy*TyrHiiS«rAsnVtiLyaGluL«u5«rHiiLyiPhtGiyIl« 
TCAGCAAGAACATGAAAAATATCATAGCAATGTAAAAGAACXGTCTCATAAATTTGGAAT 

4100 .... 
ProAtnLeuV«lAUArgClnIl«V«lA«nS€rCytAUGUCyiClDGlnLy«ClyGiu 
ACCCAATTTAGIGGCAAGGCAAATAGTAAACTCATCTGCCCAATGTCAACACAAACGGGA 

4200 

Al*Ilefli»G lyGlnV* IAidAUG luL«uG lyThrTrpGlaMetA.pCy»ThrHi#Lau 
AGCTATACATGCCCAAGTAAATGCAGAACTAGGCACTTGGCAAATGGACTCCACACATTT 
GluClyLy»Il«Il«IlaV»lAlaV«lIiaValAU8«rClyPa«IU01uAl«61uVal 
AGAACCAAACATCATTATACTAOCACTACATCTTCCAACTGCATTTATACAACCAGAAGT 

. * . 4300 1 

Xl*FroGlnGluS«rGlyArf GlBThrAl«L«uPh«L«uL«uLysL«uAl*S«rArgTrp 
CATCCCACACGAATCAGGAACACAAACACCACTCTTCCTATTGAAACTGOCAAGTAGGTG 

ProIl«ThrHi»LeaHiaThrA«pA»nGlyAlaA»aPh«TBrS«rClaGluV*lLyaM«t 
GCCAATAACACACTTCCATACACATAATGGTGCCAACTTCACTTCACACOAGCTCAACAt 

4400 .... 

V*XAl*rrpTrpIl«ClyXl«GluGlo5«rPl»«ClyV«lProTyrA«ttProClB8«rGln 
GGTAGCATGGTGGATACCTATAGAACAATCCTTTGCAGTACCTTACAATCCACAGACCCA 

* • • • 4300 
ClyV*lV«lGluAlaMatAiaHi»HiaLettLyiA»aGlBll«S«rArgIUAr£CluGla 

ACGAGTAGTAGAAGCAATGAATCACCATCTAAAAAACCAAATAAGTACAAICAGAGAACA 

Al«AinThrIl«GluThrIi«V«lL«uM«tAl«Il«Hi»Cy«M«tAtnPh«LyiArftAr« 
G GC AAATACA ATAG AAAC A A TAG TAG T AAT CC C AATT CATTOC A TGAATTTTAAAAGAAG 

4600 

ClyGlyIltGlyA»pM«tIbrPr©8«rGluArfL«oIltA»ttMttIUTbrThrCluCla 
GGGGGGAATACGGCATArGACTCCAICACAAACATTAAXCAATAIGATCACCAcIo^ 

4 JJi!?2f ClnPbtL * u(!lnAl * L y ,A<aS * rL y BL « uL y«4«pPh«ATg7«lTyrPl>«Ai« 
ACACATACAAITCCTCCAAGCCAAAAATTCAAAATTAAAAGAITTTCGGGXCIAXITCAG 

«... 
CluGlyArgA«pGloI.«uTrptyiGlyProGlyGlttL«oL«uTrpI.yiCly01uGlyAi« 
AGAAGGCAGAGATCACTTGTGGAAAGGACCTGGGGAACTACTGTGGAAAOGAGAAGOACC 

• • . . 4800 

A^i^J^!* y * V * lGlyThrA * pIleLy,ll-11,ProAr * Ar 8Ly«AULyiXl.Il. 
AGTCCTAGTCAAGGTAGGAACAGACATAAAAATAATACCAAGAAGGAAAGCCAAGATCAT 
**•••. 
Ar 8 A«pTyrClyGlyArgGlaGluM«tA«p8erGly8«rHi«L«uGlttGlyAlmArgGlu 
MetGluGluA »Pl'y«ArgTrpHeV«iV.iProIhrTrpArgV«lProClyArg 

CAGACACTATGGAGGAAGACAAGAGATGGATACTGGTTCCCACCTGGAGGGTGCCAGCGA 
• ' . 4900 

AipClyGluM.tAU 

M«tGluLytTrpHi«SerLeuV«lLy»TyrL«uLy»TyrLy»ThrLyfA»pL«uCiuLy« 
CCATGGAGAAATGCCATAGCCTTGTCAAGTATCTAAAATACAAAACAAAGCATCTAGAAA 

V.lCy.TyrV.lProHiiHi.Ly.V.iClyTrpAlaTrpTrpTbrCy.SerArgV.lil. 

AGGTGTGCTATGTTCCCCACCATAACGIGCGATCGGCATCGTCGACTTCCACCAGGGTAA 

3000 . . 

Ph«ProL«uLyaC iyAanSerHiaLauG lu 1 1 «G laA laTy rTrpAanLtuTarProG lu 

TATTCCCATTAAAAGGAAACACTCATCTAGAGATACAGGCATATTGGAACTTAACACCAC 
" 3100 
Ly»ClyTrpL«uS«r SerTyr SerVa lArgIl«ThrTrpTyrThrGluLy«Ph*TrDThr 
AAAAAGGATGGCTCTCCTCTTATTCACTAAGAATAACTTGCTACACAGAAAAGTT^ 

* * • I 

rA A !?riiJi^5 OA8pCy ' AlaA * pV * lLeullaHi6SerThrT y rp h*?r<>Cy.PheThr 
CAGATGTTACCCCAGACTGTCCAGATCTCCTAATACATAGCACTTATTTCCCTTGCTTTA 

5200 

rA A ;fJi^. luValAr8ArgAUXleAr » G1 y GluL y» L « uL « u S«rCyaCyiAanTyrPro 

CAGCAGGTGAAGTAAGAAGAGCCATCAGAGGGGAAAAGTTATTCTCCTGCTGCAATTATC 

••*•«. 

ArgAlaHiiArgAUGlnValPro8erLauGlnPheLtuAl«L«ttV«lValVaIGloG U 

CCCGAGCTCATACACCCCACGTACCGTCACTTCAATTTCTCGCCTTAGTGGTAGTCCAAC 

3300 
M«tThrA«pProArgGluThrV*lProProGlyAsnS«rGlyGluGluThrIl«G ly 
A«aA«pArgProGlaArgAip3«rThrThrAr»L 7 *GUArgAr|ArjA»pTyrArgArg 
AAAATCACAGACCCCAGAGAGACACTACCACCAGGAAACAGCCCCGAAGA'JACTATCCGA 

5400 

GiuAl*PheAl*TrpL«uA«nArgThrV»iCluAl«I l«AinArgGluAl«V«lA»nHit 

GlyLeuAtgL«uAlaLyiGlnA»pS«rArgSerHiiLyiG lnArg8trS«rGluS«rPro 
GAGCCCTTCCCCTGCCTAAACAGGACAGTAGAAGCCATAAACAGAGAACCACTCAATCAC 

• »«••• 
L«uProArgGluL«uXl*PhcGlaV«lTrpGlaArg3erTrpArgTyrTrpHi«A«pGlu 

ThrProArgThrTyrPta«ProClyV*lAUGluV»lL«uGluIl«L«uAU 
CTACCCCGACAACTTATTTTCCAGCTGTCCCACACGTCCTGGAGAIACTGCCATCATGAA 

5500 . 

GlnGlyM«tS«rGluS«rTyrThrLy»TyrArgTyrL«uCy»H«H«GlnLy«Al»V4l 
CAACGCATGTCAGAAAGTTACACAAACTATAGATATTTGTGCATAAIACAGAAAGCACTG 

TyrM«tHi«V4lArgLy8GlyCyfThrCy«L«uGlyArgC lyHiflClyProGlyGlyTrp 
TACATCCATGTTAGGAAAGGGTGTACTTGCCTGGCGACGGCACATGGGCCAGGAGGGTGG 

5600 ..... 
ArgProClyProFro?roProProProProGlyL«uV«l 

M«tAl«GluAl«FroThrGlu 

AGACCACGGCCTCCTCCTCCTCCCCCTCCAGGTCTCGTCTAATCCCTGAAGCACCAACAC 
..... S700 
LeuProProV*iA»pGlyThrProLeuArgG luProGlyA«pGluTrpIl«Il«GluIl« 
AGCTCCCCCCGGTGGATGGGACCCCACTCACGGAGCCACCGGATGAGTGGATAATAGAAA 

• *«••• 

LeuArgG luZ l«Ly»GluC luAl«LeuLy«Ui»Ph«A«pProArgI.«aL«uXl«Al«L«u 
TCTTCACAGAAATAAAACAAGAAGCTTTAAAGCATTTTGACCCTCGCTTGCTAATTGCTC 
. * . 5800 • . 

M«CGluThrPr«L«uLytAl«FroGluS«rS«rL«u 
GlyLy«Tyrll«TyrTbrArgUi«CiyA«pTbrLeuGluGlyAl«ArgCLuL«uZl«Lyt 
TTGGCAAATATATCTATAC TAGACATGGAGACACCCTTGAACCCGCCAGAGAGCTCATTA 

« • • # • • 

LyiStrCyiAinC luP roPhe SarArglhr SerGluC InAap V« lAlalhrGlnCluLeu 

ValL«uClnArgAlaL€uPh«ThrHiiPh«ArgAl*ClyCy$GlyHi«S«rArgIl«C Ly 
AAGXCCXGCAACGAGCCCTXXXCACGCACXXCACACCACGAXGXGGCCACTCAAGAAIXG 

5900 ♦ 
A laArgGlaGlyG luG lu I l«L«uSerG InLauXy rArgProLeuG luXbrCy lAinAin 

G InThrArgG lyG lyA«nP roL«uS«r Alal laProThrProArgAanMttCln 
GCCAGAC AAGGGGACGAAA7CCTCTCTCAGC TAXACCGAC CCCTAGAAAC AXGCAATAAC 

6000 

SerCyaXyrCy «Ly«ArgCy «Cy »XyrHisCysG InMet Cy sPht LouAanLysG ly L«u 
TCAXGCXAXXGTAAGCGAXGCXGCXACCAXXGXCAGAXGTCTXXXCXAAACAAGGGG CTC 

GlyllaCyalyrGluArgLyaGiyArgArgArgArglfarProLyaLyaXbrLyaTtarlia 

MatAfnGluArgAIaAspG luG luG ly LeuG lnArgLyiLauArgLau II a 
GGGAXAXGXXATGAACGAAAGGGCAGACGAAGAAGCACTCCAAAGAAAACXAAGACXCAX 

6100 

ProSerProXhrProAapLyt 
ArgLeuLeuHisGlnThr 

MttMacAsnGlnLeuLeuIlaAlallaLauLauAla 

ccgxctcctacaccagacaagtgagxaxgaigaaicagcxgcxiatxgccaxxxxaxYag" 

S«rAlaCy»L«uV«lTyrCy«ThrGlnTyrV«lThrV*lPheTycGlyV*lProThrTrp 
CTAGTGCTTGC TTAGTATATTGCACCCAATATGTAACTGTTTTCTATGGCGTACCCACCT 

6200 
Ly«A§nAi*ThrIl«ProLeuPh«Cy»AUThrArgAinArgAtpThrTrpOlyThrIlt 
GGAAAAATGCAACCATTCCCCTCTTTTCTGCAACCACAAATAGGGATACTTGCGGAACCA 

6300 

ClaCy«Le«i?roA*pA*nA*pAtpTyrG LnGluXleThrLeuAtaValXhrGluAlaPhe 
TACAMCCtTGCCTGACAATGATGATTATCACGAAATAACTTTGAATGTAACAGAGCCTT 

*>•«•*• 
AsgMll«tTrpAgoAf oThrVa IThrGluG lnAlAXleCluAtp ValTr pHitLtuPheG lu 
TTGATGCATCCAATAATACAGTAACACAACAAGCAATAGAAGATGTCTGCCATCTATTCC 

6400 

ThrS«rI l«Ly*ProCy* V«lLy«LtuThrProLeuCy »Va lAl«M«t LyiCyt Str S«r 
AGACATCAATAAAACCATCTGTCAAACTAACACCTTTATGTGTAGCAATGAAATCCAGCA 

ThrGluS«rS«rThrG2yA#nAsnTbrThrS«rLyiStrThr StrTbrThrThrThrThr 
GCACAGACAGCAGCACAGGGAACAACACAACCTCAAAGAGCACAAGCACAACCACAACCA 

6500 , 

ProTfarAipGlnCluC InGlu I l«S«rGluA*pTbrProCy*Al«ArgAl*iAfpAgnCy • 
CACCCACAGACCAGGAGCAAGAGATAAGTGAGGATACTCCATGCGCACGCGCAGACAACT 

• « t • 6600 

SerGlyLauG lyGluG luG luThrIl«A« nCytC laPb«A»QM«tTbrGlyL«uC luArg 
GCTCAGGATTGGGAGAGGAAGAAACGATCAATTGCCAGTTCAATATCACACGATTAGAAA 

**..«. 

AipLyiLyiLyaClnTyrAtnCluThrTrpTyrSarlyaAapValValCyaGluThrAan 
CAGATAAGAAAAAACAGTATAATGAAACATGGTACTCAAAAGATGTGGTTTGTGAGACAA 

. i . 6700 I 

Aio8erThrA»nGinThrGlnCy»TyrM«tA»aHi«Cy«A«oThr5«rV«iIi«ThrClu 
ATAATAGCACAAATCAGACCCAGTGTTACATGAACCATTGCAACACATCAGTCATCACAO 

• »«... 
SerCyiA«pLytBitTyrTrpA«pAl«Zl«ArgFb*ArgTyrCytAl«ProProGlyTyr 

AATCATGTGACAAGCACTATTGGGATGCTATAAGGTTTAGATACTCTCCACCACCGCQTT 

6800 .... 
AlaLauLauArgCyaAanAapThrAanTyrSarGlyPhaAlaProAaaCyaSarlyi Val 
ATGCCCTATTAACATGTAATGATACCAATTATTCAGGCTTTGCACCCAACTGTTCTAAAG 

• . . . . 6900 
ValAlaS«rTtarCy«ThrArgM«tM«tGluThrClnThrSarTbrTrpPheClyPaeAan 

IACTAGCTTCTACATCCACCAGGATGATGCAAACGCAAACIICCACATGGTITGGCTTIA 

GlyThrArgAlmGluAanArgThrTyrl l«TyrTrpEiaGlyArgAspAtnArgThr I Le 
ATGGCACTAGAGCAGAGAATAGAACATA7ATCTATTGGCATGGCAGAGAZAATAGAACTA 

• . . 7000 * * 

I l*S«rL«uA«nLy§TyrTyrA«aL«uS«r LeuUiiCy *Ly t ArgProGlyAjnLy ■ Thr 
rCATCACCTTAAACAAATATXATAATCTCAGTTTCCATTGTAAGAGGCCAGGCAATAACA 

VALLy»GlnIleM«tL€uM«t8«rGlyHitV«lPh«His8erHi$TyrGioProIl«A«n 
CAGTGAAACAAATAATGCTTATGTCACGACATGTGITTCACTCCCACTACCAGCCCATCA 

7100 • 

Ly»ArgProArgGlnAl«TrpCyitrpPh«Ly»GlyLyiTrpLy«AtpAl«iMetG InC lu 
AIAAAACACCCAGACAAGCATGGTGCTGGTTCAAAGCCAAATGGAAAGACGCCArCCAGG 

• . . • ♦ 7 200 
VgiLytGiuTlirL«uAULytEisProArgTyrArgGlyThrA*nA»pThrArgAtnlla 

ACGTGAAGGAAACCCTTCCAAAACATCCCAGGTATAGAGGAACCAATGACACAAGGAATA 

• •«••• 
S«rPheAl*AlaProClyLy*GLySarA«pProGluV«lAlaTyrM«tTrpThrAsnCy • 

TTAGCTTTGCACCGCCAGGAAAAGCCTCACAC CCAGAAGTAGCATACATGTGGACTAACT 

7300 

ArgClyGluPb«L«uTyrCysAaaK«cThrTrpPheLeuA*BTrpIl«CluAinLyaThr 
GCAGACGACAGTTTCTCTACTGCAACATCACTTCCTTCCICAATTGGATAGACAATAACA 
Ui#A«A.nTyrAUProCyiUi»IUl.yi.CUIleIi«AinTbrTrpUi«Ly»V*lCly 
CAWCGCAAXIAXCCACCCICCCAXAXAAAGCAAAXAAXTAACACAXGOCAXAAGCXAG 

7 400 . • • « 

A»*A«nV«lXyrL«uProProArfC iuClyG lul«u8«rCy»A»oS«rThrV»lXhrf«r 
GGAGAAATGTATATTTGCCTCCCACCCAACOCOACCTGXCCTOCAACTCAACACTAACCA 

1 . • 7S00 

H«Ii«Ai«AinHeA*pTrpClDA«oAinAinGloIhrA»nIleThrPh«8«rAl*Glu 
gcataattgctaacattgactggcaaaacaataatcagacaaacaxtacctitagtgcag 

• ••••• 

V«lAl«GluL«uTyrArgL«uGlul«uGlyA«pXyrLy«L«uV*lGluIl«XhrProXl« 

ACGTCCCAGAACTATACAGATTGGAGTTGGGAGATXAXAAAXTCCTACAAAXAACACCAA 

7600 

GlyPh«Al«ProThrLy«Clul,y«ArgTyrS€rS«rAUHisGlyArgHi«IbrArgGly 
TTGGCTXCGCACCTACAAAAGAAAAAACAIACTCCTCXGCXCACGGGAGACATACAAGAG 

. ♦ • • • 

V«lPbeV«lL«uGlyPh«L«uGlyPb«L«uAlaXhrAl«GlyS«rAl«M«CGlyAl«Al« 

GXGXGXXCGXGCTAGGOTTCXTGGGXTXTCXCGCAACAGCAGGXXCXGCAATGOOCGCCC 

7700 .... 
S«rI.«uIhrV«18«rAl«GlnS«rArgIhrLeuL«uAl«GlyIl«V«lGittGinGlaGln 
CGXCCCXGACCGTGTCGGCXCAGTCCCGGACXTTACXGGCCGGGAIAOXGCACCAACAGC 

. . 7800 

GlnL«uL«uA«pV* 1 V* lLy« ArgG ioG InC iuL«uL«uArgttttXhr?a HrpG lyTbr 
AACACCTGTTGCACGTGGTCAACAGACAACAACAACXGXXGCGACIGACCGXCXGGGGAA 

. 

Ly»A»nLtuGlnAl»ArgV«lXbrAl»Il«GluLyiTyrL«uGlaA»pGlBAl«ATgL«u 

CGAAAAACCICCAGGCAACAGTCACXGCXAXAGAGAAGXACCXACAGCACCACGCGCGGC 

7900 

AinSftrXrpGlyCy«Ai*Pta«ArgGlttV*lCy«HitXhrXhr¥«iProXrpV«iAinA§p 
TAAAXXCAXGGGGATGXGCGTITAGACAAGXCTGCCACACXACXGXACCAXCGCXXAAXC 

SerL«uAl«ProAtpIrpA«pA»nMfttTbrTrpGUGluXrpGluLysGln7«lArgTyr 
ATXCCTXAGCACCTGACTGGGACAAXAXGACGXGCCAGGAATCGGAAAAACAAGXCCCCT 

i . 8000 ♦ • ♦ • 

L€uGiuAl«Atnll«S«rLy»SerLeuGluGinAlaGlnIl«GlnGlnGluLyiA«nM«t 

ACCTGGAGGCAAATATCAGXAAAAGTTTAGAACAGGCACAAAXTCACCAAGAGAAAAATA 

8100 

TyrG luL«uG lnLy«LeuAin8erTrpA«pIlePh«G lyAanTrpPbcAipLcuXhr S e r 
TGTATGAACTACAAAAATTAAATAGCTGGGATATTTTTGGCAATXGGTXTGACTXAAC CT 

XrpV«lLy«IyrIl«GlnTyrGlyV«lLeuIleIl«V*lAl«V»lIl«AlftL«uArgIl« 
CCXGGGXCAAGTATATTCAATATGGAGTCCTTATAATAGTAGCAGXAAXAGCXXTAAGAA 

8200 

V*lIl«XyrV«lV*lC lBM«tLeu8erArgLeuArgLyiG lyTyrArgProV»lPhtS«r 
TAGXGAXATAIGIAGTACAAATGTTAAGIAGGCTTAGAAAGCCCXAXAGCCCXCXXXTCT 

S«rXLflSerXbrArgXbrGlyAipS«rGlnPro 
A»nProTyrProCloGlyProGlyXhrAl«8ftrGln 
8erProProClyTyrIl«ClnGlnH«Hl»Il«HiiLy«A»pAriClyGlnProAl«Ain 
CTXCCCCCCCCGGTTATATCCAACAGAICCAIAXCCACAAGGACCGGGGACAGCCACCCA 

8300 ...» 
ThrLytLT#ClnLy«Ly»ThrV«lCluAl«Thr7»lCluTbrA»pThrClyFroClyAri 
A*«ArtA«nAtgA*gArgArgTrpLy»GlnArgTrpArg«lnIl«L«uAlgLtuAUAip 
Ma01uIhr01uCluA«pClyClyS«rA»nClyGlyAipArfTyrTrpProTrpPToIl« 
ACOAAOAAACAGAAGAAGACCGTCGAAGCAACOGTCGAGACACATACTCCCCCTCCCCCA 

8400 

S«rIl«TyrThrPbeProA«p?roProAl*AipS«rProL«uA»pG laThr Il«ClnHi» 
Al«TyrIl«HisPh«L«uIl«Arg61sL«ttIl«ArgL«ttL«uThrArgLcuTyr8«rZl« 

TAGCATATATACATXTCCTCATCCGCCACCTGATTCCCCTCTTOACCAGACTATACAGCA 

LeuGlnGlyL«uThrIleClnGluLeuProAapProProThxHisLeuPToGluS«rGIa 
Cy«ArgAipL«uL«uS«rATgSerPb«L«uThrL«uGlaLcuIl«TyrGlnA«nL«uArg 
TCTGCAGCCACTTACTATCCAGCACCTTCCTCACCCTCCAACTCATCTACCAGAATCICA 

. . • 8500 • • 

ArgLauAlAGluthr HacClyAla3«rG lySarLytLya 

AapTrpL«uArgLeuArgThrAl*Ph«L«uGlaTyrGlyCyaGluTrpIl«CloGUAl« 
CAGACTGGCTGAGACTTACAACAGCCTTCTTCCAATATGGGTGCGAGTGGATCCAAOAAG 

Hi«S«rArgProPtoArgciyL«uGlnCluArgL«uLtuArgAUArgAUGlyAUCyt 

Ph«GlaAUAlaAl*ArgAl«fhrArgCluTbrL«uAl«ClyAUCyiArg«lyi«uTrp 
CATTCCAGGCCGCCCCGAGCGCTACAACACA6ACTCTTCCC0CCGCCTCCAG0CCCTICT 

8600 ...» 
GlyGlyTyrTrpAsnG luSerGlyGiyCluTyrS«rArgPh«GlnGluCiyS«rAipArg 

ArgV«lL«uGiuArgIleClyArgCiyIWLeuAl«V«lPr©ArgArgIl*ArgGlnCly 
GCAGGGTATTGGAACGAATCGGGAGGGGAATACTCGCGGTTCCAAGAACGATCAGACAGG 

8700 

G luGlaLyaSerProSerCyaC luG lyArgG InTyrG laG laG lyAapPh«K«CAsaThr 

AlaG lu I ieAlaLeuLeu 
GAGCACAAATCGCCCTCCTCTGAGGGACGGCAGTATCAGCAGGGAGACTTTATGAATACT 

• • • • • • 

ProTrpLysA»pProAl«Al«GluArgG luLy • AmLtuTyrArgG lttC InAinMttAip 
CCATCCAAGGACCCAGCAGCAGAAAGGCAGAAAAATTTGTACAGGCAACAAAATATGGAT 

8800 

A«pVa lAtpS«rA§pA»pAtpA« pG In V« 1 Arg V« lS«r V* IThrProLyt V*lProL«u 
GATCTA'GATTCACATGATGAIGACCAACTAACAGTTTCTGTCACACCAAAACTACCACTA 

«•»••* 
ArgProMecThrUisArgLeuAl«ll«A«pMet SerUitL«uIleLy«TbrArgGlyGly 
AGACCAATGACACATAGATTGGCAATAGATATGTCACATTTAATAAAAACAAGGGGGGCA 

8900 . 

L«uC luClyMetPh«Tyr SerG lu A rgA rgH i »Ly • 1 1 eLeuAsn 1 1 eTy r L«uG luLy s 
CTGGAACGCATGTTTTACAGTGAAAGAAGACATAAAATCTTAAATATAIACTTACAAAAG 

9000 

GluGluG lyllelleAlaAspTrpG lnAsaTyrThrHitG lyProG ly V« lArgTyrPro 
GAAGAAGGGATAATTGC AGATTGGCAGAACTACACTCATGCGCCAGCAGTAAGATACCC A 

• ••••• 
M«tPhtPheGlyTrpL€uTrpLyiL«uV4lProVglAgpV«lProGlaCluGiyClttAfp 
ATGTTCTTTGGGTGGCTATGGAAGCTAGTACCAGTAGATGTCCCAC AAGAACGGGAGGAC 

9100 

ThrG luTbrHi«Cy sLtuV* 1H iiProAlaClnThr 5«rLysPh«AspA«pProHi«G ly 
ACTGAGAC TCACTGCTTAGTACATCCAGCACAAAC AAGCAAGTTTGATGACCCGCATGGG 

• • ♦ • • • 
C luThrLeuVa ITrpC luPh«A«pProLeuL«uAlaTyr8«rTyrG luAl«Ph«Xl«Arg 
GAGACACTAGTCTGGGAG TTTGAT CCCTTGCTGGCTTATAGTTACGAGGCTTTTATTCG G 

9200 . ♦ . » 
TyrProCluGlu?h«ClyHi*LyaS«rGlyL«uFroGluGluGluTrpLy«AlaArsL«u 
TACCCAGAGGAATTTGCGCACAACTCAGGCCTGCCACACCAAGAGTGGAAGCCCAGACTG 
..... 9300 
' Ly«Al«Ar|GlyIl«ProPb«S«r 
AAACCAAGAGCAATACCATTTAGTTAAAGACAGGAACAGCTATACTTGGTCAGGCCAGGA 

AGTAACTAACAGAAACAGCTCAGACTGCAGCGACTTTCCAGAAGGGGCTGTAACCAAGGG 

. . . 9400 

AGGGACATCGGAGGAGCTGGTGGCCAACGCCCTCATATTCTCTGTATAAATATACCCCCT 

AGCTTGCATTGTACTTCGGTCGCTCTGCGGAGAGGCTGGCAGAT7GAGCCCTGGGAGGTT 

9500 . • . • 

CTCTCCAGCACTAGCAGGTAGAGCCTGGGTGTTCCCTGCTAGACTCTCACCAGCACTTGC 

9600 

CCGGTGCIGGGCAGACGGCCCCACGCTTGCTTGCTTAAAAACCTCCTTAATAAAGCTCCC 
AGTTAGAAGCA 

Example 5: Sequences of the Coding Regions 
for the Envelope Protein and GAG 
Product of the ROD HIV-2 Isolate 

Through experimental analysis of the HIV-2 ROD isolate, the 
following sequences were identified for the regions encoding the 
env and gag gene products. One of ordinary skill in the art will 
recognize that the numbering for both gene regions which follow 
begins for convenience with "1" rather than the corresponding 
number for its initial nucleotide as given in Example 4, above, 
in the context of the complete genomic sequence. 
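
The relationship between the two numbering schemes is a fixed offset. The following Python sketch is an illustration only: it converts between a position in the complete genomic sequence and a position counted from "1" within a gene region. The start coordinates are those given for the Env and GAG regions in Example 4; that the listings which follow begin at exactly those nucleotides is an assumption, and the names `GENE_START`, `to_gene_numbering` and `to_genome_numbering` are hypothetical.

```python
# Start coordinates of each region in the complete genomic sequence (Example 4).
# Assumption: each gene-region listing begins at exactly this nucleotide.
GENE_START = {"env": 6147, "gag": 546}


def to_gene_numbering(genome_pos, gene):
    """Convert a genomic position to a position counted from 1 within the gene."""
    start = GENE_START[gene]
    if genome_pos < start:
        raise ValueError("position lies upstream of the gene start")
    return genome_pos - start + 1


def to_genome_numbering(gene_pos, gene):
    """Convert a position counted from 1 within the gene to a genomic position."""
    if gene_pos < 1:
        raise ValueError("gene-relative positions start at 1")
    return GENE_START[gene] + gene_pos - 1


if __name__ == "__main__":
    print(to_gene_numbering(8720, "env"))
    print(to_genome_numbering(100, "gag"))
```

Under that assumption, genomic nucleotide 8720 corresponds to position 2574 of the envelope region, and position 100 of the gag region corresponds to genomic nucleotide 645.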
Envelope sequence 



M*tK9tA«a0lBLiuLtuXl«Al*XlftL«ttL*ttAliStr41ftC7t 
ATOATGAATCACCTCCTTATTGCCATf TTATTAGCTAGTGCTTOC 

• • • • 
L«ttV»lTyrC7$ThrGlnTyrV*lThrV*l?h«TyrGly7«lPro 
TTACTATATTOCACCCAAf ATGIAACTGTTTTCTATOGCQTACCC 

■ • • • • 

ThrTrpLyaA«ftAlaTtrll«Frol.««?fc«CylAl*ThrArtAa» 
ACGTMAAAAATQCAACCATTCCCCTCTTTTGTGCAACCACAAAT 

100 . 
ArsAa»ThrTr»ClyTarXl«ClaCy»l.«a?ToAa»AaaA«»A*p 
AOCCATACTTCCCOAACCATACAaTOCnOCCTCACAATCAtOAt 

TyrcioGlttllaTaxltuAioValTarGlttAlaPhaAayAlaTrp 
TATCA0OAAATAACT?TQAATCTAACA«AO«CtTTTOAfeCAT«« 

200 . • 

A«uA«aThrf*lTBr«luClBAl«IlaGlaA«»TatTr»liaL«» 
AATAATACAGTAACAGAACAAOCAATAGAAGATOTCTCGCATCtA 

• «>••• 
Ph«ClothrltrIl«ly»f roCy»7»XLy$L««IhrfroL««Cyt 
TTCGAGACAIGAATAAAACCATOTGTCAAACTAACACCTTTAtGf 

300 • 
V*lAl*M«cLy»Cyi8«rB«rThr0l»8trS«rThrClyA»aA*tt 
CTAGCAATCAAATOCACCAGCACAGAaAOCAGCACAGGOAACAAO 

ThrThr»«rLy»S«rThrfltrThrTbtThrTbrThrProTJ»rA«p 
ACAACCTCAAACAOCACAAGCACAACOACAACCACACCCAGACAC 

400 

GlaGluGlBGlttXl«S«rCluAspThcProCy«AUArgAiaAta 
CAGGACCAACAGATAACTGACCATACTCCATGCCCACOCQCACAC 

• • • • • 

A*aCyaSarGlyLavClyGluGlttGl«TarXl«A»aCy«6laFa« 
AAC ICC TCAOO ATTCCGAO AOG AAOAAACGATC AATTGCC AGTTC 

• • • • 

AsnUtCtlirGlyLtuGluArgAspLyftLytLysGl&lyrAft&Gl* 
AATATGACAGGATTAGAAAGAGATAAGAAAAAACACTATAATGAA 
SOO • ♦ * . 

ThrTrpTyr8€rLy«A»p Va IVt ICytG luThrA«nA«ai«rThr 
ACATGGTACTCAAAAGATGTGGTTTGTGAGACAAATAATAGCACA 

• • • • 
AsnGlaTbrGlnCy«Tyr!i«CAinBiftCytAanTbrl«r7«lIlt 
AATCAGACCCAGTGTTACATGAACCATTGCAACACATCA6TCATG 

• 600 • • » 
ThrGlu8«rCy«A*pLysHi*TyrTrpAtpAlaIltArg> h«Ar§ 
ACAGAATCATGTGACAAGCACTATTGGGATGCTAtAAGGTTTAGA 

• » • • 

TyrCyaAlaProProGlyTyrALaLaaLattArgCytAtaAspThr 
TACTGTCCACCACCGOGTTATCCCCTATTAAGATGTAAIGATACC 
•. 700 . • 

AaaTyrSarGly*fc«AlaPToABaCy«l«rLysTal?alAUl«c 
AATTAtTCAGOCTTTGCACCCAACTGTTCtAAAGtAOTAGCTTCT 
TbrCy«thrArflUtM«tCittTlirCloIfcrS«rT&rTrj»h«ei7 
ACATCCACCACC1TCATOCAAAC0CAAACTTCCACAT0OTTT0OC 

•00 

Ph«AinClyThrAr»Al«GluA«nAriT»rTyrIl«T7rTrpIU 
TTTAATOCCACTAOAOCACACAATAOAACATATATCTAITOOCAt 

• » • • 

GlyArg AapA«aAr(TbrIltZltt«rL«aA«aL7iT7rf 7rA«a 
CCCAOACATAATACAACTATCATCACCTtAAACAAATATTATAA.T 

too 

L«uS«rL«ttIi*CyaLyiArtf reClyAaaLytTbr?«lIycfilft 
CTCACTTT0CATI0TAACACCCCA0C0AATAACACACTCAAACA4 

Il«M«tL«uM«tf«rCl7Hii7*lf b«Hi»i«rtl»t7tClBfr» 
ATAATQCTTATCTCAOGACATCTCTTTCACTCCCACTACCAOCCQ 

• ^ • • • • 
IUA$aLyiAr|*r©ArgQinAWTrpC7»Tr>*ta«L7»eijX.yt 
ATCAATAAAACACCCAOACAACCATOOI»CTOOrtCAAA«OCAAA 

iOQO • . 

TrpL7«A«pll«M«tClBClu?*lL7iThrL«uAl«L7ili4Fro 
TCCAAAOACCCCAIOCACCAGCTOAACACCCTTOCAAAACATCCC 

ArtTyrArfCl7ThrA«oAspTtarArsA«Dll«l«rrb«AlaAlt 

AGGTATACACCAACCAATOACACAACCAATATTACCTTTOCACCO 

1100 

ProCly1.yt61yl«rA«p?ro61aT«UUT7zM«tTrpTbrAs« 
CCAGGAAAACCCTCACACCCAGAAGTAOCATACATGTCGACTAAC 

• . • • 4 • 

CysArgGl7GluPfctL*uf 7tC7tA*&M«tTbrTrp?h«LtuAaa 

XGCACAC6A0AGTTTCTCTACTGCAACATCACTT0GTTCCTCAAT 

1200 

TrpIltOluA«nLy»TftrIiiArgA»nTyrAl»ProCy$IiiIl« 
tGGATACAGAATAAGACACACCGCAATTAtGCACCGTGCCATATA 

Ly»cinIl«Ii»A«oThrTrpai«Ly«V«lGlyAr»AinVtll7t 
AAGCAAATAATTAACACATGG CATAAGCTAGGGAGAAATGTATAT 

1300 

L«uProProA?tCluOl7CluL«u8«rCytA»Bl«rThrV*lTbr 
TTGCCTCCCAGGGAAGGGGAGCTCTCCTGCAACTCAACAGTAACC 

• • • ♦ • 

t«rXlftXl«AlftA«nIl«AtpTrpGlnA«ftAsnAt&GlaThrAta 
AGCATAATT0CTAACATTGACTG6CAAAACAATAATCACACAAAC 

» • • • 

IleTbr?h«»«rAl*Clu7*lAUCluL«»TyrAr|L«uCluL«u 
ATTACCTTTAGTGCAGAGGTOGCAGAACTAIACACAITGGAGTTG 

1A00 . . 

ClyA«pTyrLy»I.«uV»iaiuIl«Tbr?roIUGlyPta*AUPro 
SCAGATTATAAATtCCtAGAAATAACACCAATTGGCTTCGCACCT 
Tarty »G laLy •XriTyr 3a riarAl al liC lyArgBUTarArg 
A C AAAAC AAAAAACAT A CTCCTCTOC TC AC GGCAG AC ATAC AA6A 
1300 

Gly?*iPBaValLauGlyPaalattGlyf aaLaaAiATatAlaOiy 
CGT6T0TTCGTCCTAGCOTTCTTGOCTTTTCTCCCAACAOCAW* 

• • • • 
SarAlaMatGlyAlaArgAlaf •rLtuTbrf Alt«rAlsGUt«f 

TCTOCAATCOOCGCTCOACCOTCCCtOiACCGTGTCGGCTCAGTCC 

1600 , • 

ArgThrL«uL«uAU0lyIit7«lCla«lsGiBGloGlaL«ttlM 

CGGACTTTACTGGCCCGGAIAGTGCAGCAACAGCAACAGCTGtTC 

« • • • 

AipValValLjiArgClaClaCluLaulauArgLauTarTalTrp 
GACGT06TCAA0ACACAACAAOAACTOTTCCGACTGACCCTCTW 

1700 

ClyTarLy»A«aL«u61nAlaArg fait arAlallaOlulyatytr 
G GAACGAAAAACC TCCAGGCAAGAGTCACTGCTATAGAGAA«tAC 

• . • • 
L«uGlaAspGlaAlaArsL«uAsttS«rtrp61yCyiAl«rfc«A.r$ 
CTACAGOACCAGOCCCOCCTAAAtTCATCCCGATGTGCGTTtAOA 

• • • * 1^0(0 
GlaVaiCyaliaThrTarfalP rotrpf alAaaAialariaaAU 
CAAGTCTCCCACACtACTOTACCATGGOTTAAIGAtTCCtTAGCA 

• • • • • 
ProA»pTrpA«pA»oM«cThrTrpClaOluTrpGl*I.y»01a?al 

CCTGACTG6GACAATATGACGTGG CAGQAATCGQAAAAACAAGTC 

• • • • • 
A r g T y r L • uG 1 uA 1 «A a al l'i S t r Ly a t« r lYuG 1 uiO In Al al la " 
CGCTACCTGGAGGCAAATATCAGTAAAAGTTTAGAACAGGGAjCAA 

1900 . « • - 

Il«GlaClnCiuLyaAanMatTyrGluLattGlaLyaLauAaaJa» 
ATTCAOCAAQACAAAAATATCTATCAACTACAAAAATTAAATAOC 

• • • • • 
TrpA»pIi«Ph«GlyA»aTrpP haAtpLauThrSarTrpValLya 

TCGGATATTTTTCGCAATTGGTTTGACTtAACCTCCTCGGtCAAC 

2000 

TyrIlaGinTyr01yYaiLauIltIlaValAliyalliaAlaX.au 
TATATTCAATATCGACTGCTTATAATAOTAGCACTAATAOCTTTA 

• • • • • 
ArgIlaValZlaTyrV4l7alGlaH«tLauStrAr|L«uArgLy« 
AGAATAGICATATATCTACTACAAATGTTAAOTAOCCTTAGAAAG 

2100 

GlyTyrArgProValPhaSarSarProProGlyTyrliaOla*** 
GGCTATAGGCCTGTYTTCTCTTCCCCCCOCGGTTAIATCCAAtAO 
Il«Hl»Il«HiiiyiAipArgClyGl»ProAl»AtnOlttOWthr 
A1CCATATCCACAACGACCG6CGACA0CCA6CCAACCAACAAACA 

2200 

«luGl»»A«pGlyGlyS«rA«o61yClyA»pArgtyrTrpProTrp 
CAACAACACCCTCCAACCAACCCTCCAOACACATACTOOCCCtflC 

I^oIl«Ai*TyrIliiiiPb«L«ttiltAtt«inL«ttIUArgL«a 
aCCATAGCATATATACATTTCCTCATCCGCCACCTGATTCGCCTC 

• • • « 

l*uTfcrArgL«uTyrS«rIl«Cy«Ar§AtpL«ol«ttt«TArgl«t 
TtCACCAOACTATACACCATCTCCACQCACTTACTATCCAGCAOC 

MOO . 
f%«L«uttarl«uCloL«uIl«TyrCl«A»nWttArgA»p*rprL»« 
TTCCTCACCCTCCAACTCATCTACCAQAATCTCACAGACTQQCTG 

A^fL«uArgTarAUfh«l«uQlnTyrGlyCy»6lutrpIl««la 
*eACTTACAACACCCTTCrrCCAATATGOCTCCGAGTGOAtCCAA 
2400 . . • 

GluAl«Fh«01aAl«Al*Al*At»Al»tbrArgGluTbrL«ttAl« 
GAAOCATTCCACCCCCCCQCQAOCCCXACAAGACACACTCTTQCG 

ClyAl*CyiArgGlyL«uTrpArgT*ll«uGlttATiIltClyA»g 
GGCGCCTCCAGCCCCTTCTGGAGCGTATTGCAACOAAtCGOCAOfl 

2500 

GlyIl«L«uAUV*l?roAr»ArgIi«ArgClaOlyAWGlttIl« 
GCAATACTCGCOGnCCAACAACCATCACACACCCACCAGAAATC 

Al*L«uLtu***01yTbrAi*?»18«rAl»GlyArgt«uTyrClo 
GCC C TCC TGTGAGGGACGG C ACTATC ACCACOGAGACTTTATGAA 

2600 

Xyr8trH«tCluClyf roS«rS«rAr»X.y»ClyGluLy»?h«Tal 
TACTCCATCOAACCACCCACCACCAGAAAQGGAOAAAAATTTOTA 

. . • • 

GlnAlaThrLysTyrGly
CAGGCAACAAAATATGGA 
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Gag sequence 



iT^WCCGAGA^CTCCGtCTTCACAGGOAAAAAAOCAGATCAA 

TXAOAAAOAAtCACCTTACGGCCCCWCAAAOAAAAAOTACA** 
. • * • •• 

l.»Ly«RiiIla?alTrpAl«AlaA««XjiI*«AifArft«««l7 
CTAAAACATATTCTCTCOCCAGCOAATAAATTOOACAQATTCWA 

100 • • • • 

LiuAl«CLu8ttL«ttL«ttGlul«rLy«Cl«0l7Cy»0laL7»IU 

TTAGCAGAGAGCClGTTGGAGTCAAAAGAGGGt7GTCAAAAA4TT 

L«uTbr7«lLtttA»pPToMtcT4iProThrGl7*«r01ttA»aIr*« 
CTTACAGTTTTAGATCCAAIOGTACCGACAGCTTCAGAAAATW 

200 

Ly»StrL«ttPhiA«nTljr7»lCyiV4lIltTrp<:y*Il«Bl»Al« 
AAAAGTCTTTttAATACtGTCTGCCTCATTTGGTGCAtACACCCA 

♦ * • • • 

CluGluLy»V*lLyiA«pThrGlttClyAl«L7tGlnIl«V»lArj. 

GAAGAGAAAGTGAAAGATACTGAAGGAGGAAAAGAAATAGTGCM 

sot 

Ar»HiiL«u7»lAl«GluTlir0l7ThrAl«GlttL7»M«tfro»«r 
A GA C AT C TAGTOO C AGAAACAGOAACTGCAOAGAAAAf OCCAAGC 

. . • • • 

IhrS«rArgProTlirAl»Frof«ri«rGlaI.7»*l7«lt*»» t 7» 
ACAAGIAGACCAACAGCAGCATCTAGCOAGAAOGGAGGAAATTAC 

ProValGlaHi»VaiGi7oijrA»nT7tTarEUIUProL«aSar 
CCAGTCCAACATOTAGCCGOCAACtACACCCAtAfACCCCTCACT 

• • • • ♦ 

?roArgThrL«»A»ttAl*Trp7«iX.y»I.«uy»lGluCluL7»lor* 
C CC C GAACCC TAAATGCC TGGGTAAAATT AGTAGAGGAAAAAAAft 

• • • • 
PheGlyAlaGluValValProGlyPheGlnAlaLeuSerGluGly
TTCGGCGCAGAAGTAGTGCCAGGATTTCAGGCACTCTCACAA60C 

300 ...» 
CysThrProTyrAspIleAsnGlnMetLeuAsnCysValGlyAsp
TGCACGCCCTATGATATCAACCAAATGCTTAATTGTGTGGGCGAC

HisGlnAlaAlaMetGlnIleIleArgGluIleIleAsnGluGlu
CATCAAGCAGCCATGCAGATAATCAGGGAGATTATCAATGAGGAA 

•00 • • • 

Al*Al«CluTrpA«pV»lGlaHWPr©tl«ProGlyfroL«»fr» 

gcagcaoaatggoaictccaacatccaataccaccccccttacca 

• • • • 
AlaGlyGlaL«BArsGluProArfGl7t«rAt»Il*AUCl7fk* 

GCGGGGCAGC7IAGAGAGCCAAGGGGA7C7GACATAGCAGG«Afi* 

700 . m 

ThrS«rTar7«lCluCluClDll«01aTTp««tPh«Ar»Pr»«l» 
ACAAGCACAGTAGAAGAACAGATCCAGTGGATOTITAGGCCACA* 
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Aa«ftafal*roT«ieiyA«aU«Ty*ArtArgtr»XU01an« 
AATCCWTACCACTAOOAAACATCtATAOAACATOCATCCAGATA 
. • • 800 • 

ClyL«ttClnty»CyiV*lAr|M«tTyrAittProTbrAiaIi«L«m 
CCAITOCAOAAfilCTSTCACCAIOTACAACCCOACCAACAtCCTA 

. • • • t 

A»pIl«Ly»ClaCly>toLyiOluf rof h»0lo8«rTyrf*lAi> 

GACATAAAACACGCACCAAAGGA6CCGTTCCAAAGCTAT8TAGAT 

«00 

ArtP&aTyrlytBarLauArgAUCluCUTaiAapPr©AiaT»l 
AGATTCTACAAAACCTTCACGOCACAACAAACACATCCAOCAOTQ 

• ♦ ♦ • 
LyaAaaTrpXatTarOlaTarLauLaofalGlaAaaAlaAamfr* 
AACAATTGGATGACCCAAACACTGCTACTACAAAATGCCAACCCA 

• • • • • 
AapCyalyaLattValLauLyaClyLauClyltatAaaf roYfcrLa* 
GACTGTAAAITAGTGCTAAAACCACIACCCAT0AACCCTAC5TTA 

1000 

Glu01ttM«tL«uTDfAl»CyiCla01y?alCly01y*r»OiyCU 
GAAGACATCCTCACCOCCTQTCAOQOCtTACGTCCGCCAOOCCAO 

LyaAl»ArgLauXatAl»CluAWLauly«GUTalIlaClyPro 
AAAGCTACATTAATOGCACAGGCCCTGAAAOAGGTCATAOOACCT 

1100 

AUProIl.ProPhaAlaAlaAUClaOlaArfLyaAUPaaLya 
GCCCCTATCCCATTCGCACCAGCCCAGCAGAGAAAGGCAIttAAA 

• • ♦ - • 
CyaTrpAaaCyaClyLyaGiuGlyHiatarAlaArgGlaCytAxg 
TGCTGGAACTGTCGAAACCAAGGCCACTCCCCAAOACAATGCCGA 

1200 

AlaProArgArgClBGlyCyaTrplyaeyaGlyLyaProGlylia 
GCACCTAGAAGGCAGCGCTGCTGGAAOTCTGGTAAGCCAGOACAC 

• • « t • 
Il«MttThrA*aCyaProAapArtOlaAla01yPaaL»uClyLa» 
ATCATCACAAACT0CCCACATAGACA00CACCTTTTTTAO0ACT0 

1300 

GlyProTrpClyLyalyaJroArgAaaPtaaProValAiaGlaVai 
GCCCCTTGGGGAAAGAAGCCCCGCAACTTCCCCCTCGCCCAAGTt 

t * • • • 

ProGlaClyLauTarPtoThrAlaProfroVa UapProAUVal 

ccccagggcctcacaccaacagcacccccactggatccagcagto 

» • • • 

AapLauL«uGluLyaTyrH«tGlnGlaGlyLyaArgGlaArgCl« 
GATCTAC7GGAGAAATATATGCAGCAA0CCAAAA0ACAGAGACAG 

1400 . 
GlnArgGluArgProTyrLysGluValThrGluAspLeuLeuHis
CAGAGAGAGAGACCATACAAGGAAGTGACAGAGGACTTACTGCAC 

• • • • 

LtuGluGlnClyGlttThrFroTyrArgGlaProf reXferGluAap 
CTCGAGCAG6GGGAGACACCATACACGGAGCCACCAACAGAGGAC 

1S00 ♦ a • 

LftuL«uHiftL«uA$aS«rL«u>btGlyL7tAtpGl« 
TTGCTCCACC TCAATtCTCTCTITCCAAAAGACCAG 
Example 6: Peptide Sequences Encoded By
The ENV and GAG Genes



The following coding regions for antigenic peptides, identified
for convenience only by the nucleotide numbers of Example 5,
within the env and gag gene regions are of particular interest.
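
The nucleotide numbers quoted for each peptide below can be read as inclusive, 1-based positions in the corresponding sequence of Example 5. The following Python sketch shows, under that assumption, how such an interval maps to a substring and to a codon count. The function names and the toy sequence are hypothetical illustrations and are not part of the original disclosure.

```python
def region_length(start, end):
    """Length of an inclusive, 1-based interval such as (1732, 1809)."""
    if start < 1 or end < start:
        raise ValueError("expected 1 <= start <= end")
    return end - start + 1


def extract_region(sequence, start, end):
    """Return the nucleotides at inclusive, 1-based positions start..end."""
    region_length(start, end)  # validates the interval
    if end > len(sequence):
        raise ValueError("interval runs past the end of the sequence")
    return sequence[start - 1:end]


def split_codons(region):
    """Split a region into consecutive triplets; trailing bases are dropped."""
    usable = len(region) - len(region) % 3
    return [region[i:i + 3] for i in range(0, usable, 3)]


# Toy sequence for illustration only (not taken from Example 5).
toy = "AAACAGGCAACATTT"
region = extract_region(toy, 4, 12)   # positions 4..12 inclusive
codons = split_codons(region)         # three consecutive triplets
```

Under this reading, the interval given for env1 (1732-1809) spans 78 nucleotides, that is, 26 codons.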

env1 (1732-1809)

Ar g»a ITarA i al 1 tC lut ya*ft 
AGAGTCACTGCTATAGAGAAftAft 

• • 

L«uOlaAiaGlaAlaAraL«ttAtafartrpClyCyiAlAlfcaA*t 
CTACAOGACCAGGCGCOGCTAAATTCATGCGCATCTGCGTTTAM 

. • • ♦ *' 

GlaVeiCya 
CAAGTCTGC 



env2 (1912-1983) 

S«rLysfrarL«ttGluGiaAlaai» 
AGTAAAAGTTTAOAACAGGCAfiAA 

1 1 tG laClaOluLy»A«nM« tlyrG luLauG IbL y •LavAaaf at 
ATTCAQCAAOAQAAAAAf ATGTATGAACTACAAAAATTAAATAGC 

Trp 
ICG 



env3 (1482-1530) 



tt9 TftrLytGluLy»Arit7rS«rS«rAl4HU01yAriEi»TfcrA*$ 
CCI ACAAAACAAAAAACATACTCCTCTGCTCACQGCAGACATACAACA 
1300 



env4 (55-129) 

CytTbrGlaTyrV* lThrV*i?h«TyrCiy?«iPTa 
TGCACCCAATATCTAACTGTTTTCTATGGCOTACCC 

• • • • 

ThrTrply«A»aAlaTarIlaf rolaaPaaCytAlalht 
ACGTGGAAAAATGCAACCATTCCCCT6TTTTGTGCAACC 
100 
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env5 (175-231) 

AtpAlp 
CATOAT 

TAT CA0OAAATAACTTTOAAT0TAA C AGAOCC TTTTCAt C CATOO 

200 • • 

AsnAta 
AATAAT 



env6 (274-330) 



CloThrl«rri«Ly»?roCy»V»lLy«L«oThrProL«oCy» 
GACAC AT CAATAAAAC C AT GT6TC AAAC TAAC AGC TTTAXQT 

S00 . 

V«iAl«M«tI.ytCya 
GTAGCAATGAAATGC 



env7 (607-660) 

AsaEiaCytAsaThrl«rT«lll« 

AACCATTGCAACACATCAGTCATC 

felQ . - 

ThrGluStrCyiA«pLy»Hi»TyrTrpAip 
ACAGAATCATGTGACAAGCACTATTGOGAT 



env8 (661-720) 

AlaXl«ArtFh«Art 
CCTATAAGGTTTAGA 

TyrCy«AlAPro*roGlyTyrAlaL«uL«ttArgCytAtaAspThr 
TACTGTGCACCACCGGGTTATGCCCTATTAAGATGTAATGATACC 
_ 700 



env9 (997-1044) 

LyiArs?roArgGlnAltTrpCysTrpFh«LytGlyLy« 
AAAACACCCAGACAAGCAT00TGCTGGTTCAAA6GCAAA 
1000 . . ■ 

TrpLy»A»p 

TGGAAAGAC 




env10 (1132-1215)



LysO l7l«rA*pFroC In? 4 lAUTyrMt cTrpThrAi* 
AAACCCTCACACCCACAACTAOCATACATGTCaACTAAC 

• • « • 

CytAr|Ctyaiu»htL«ut7rCy»A»BH«tXhrTr»Pb«L«*Ai» 
TGCACACOAOAOTTTCTCTACTCCAACATCACTTOOTTCCTCAAt 

1200 



env11 (1237-1305)

Ar»A»BTytAU?roCyiliaIlt 
CCCAATTAtCCACCCTCCCATAtA 

• • • 

LytGlall«Il«A«aThrTrpIiiLy«7»iClyAr|AinV»lT7T 
AACCAAATAATTAACACATCCCATAACCTACCOACAAATCTATAT 

1300 



gag1 (991-1053)

A«pCysl7aL««7alL«u!.y tClyLtutlyMatA^afroTtrLam 
GACT0TAAATTACTCCIAAAACCACTA6CCATCAACCCTACCTIA 
1000 

GluOWMatLtuTatAU 

GAACAGATGCTGACCOCC 



Of the foregoing peptides, env1, env2, env3 and gag1 are
particularly contemplated for diagnostic purposes, and env4,
env5, env6, env7, env8, env9, env10 and env11 are particularly
contemplated as protecting agents. These peptides have been
selected in part because of their sequence homology to certain of
the envelope and gag protein products of other retroviruses in
the HIV group. For vaccinating purposes, the foregoing peptides
may be coupled to a carrier protein by utilizing suitable and
well known techniques to enhance the host's immune response.
Adjuvants such as calcium phosphate or alum hydroxide may also
be added. The foregoing peptides can be synthesized by
conventional protein synthesis techniques, such as that of
Merrifield.

It will be apparent to those skilled in the art that various
modifications and variations can be made in the processes and
products of the present invention. Thus, it is intended that the
present application cover the modifications and variations of
this invention provided they come within the scope of the
appended claims and their equivalents. For convenience in
interpreting the following claims, the following table sets forth
the correspondence between codon codes and amino acids and the
correspondence between three-letter and one-letter amino acid
symbols.
                  DNA CODON            AMINO ACID 3 LET.     AMINO ACID 1 LET.

  1    3 \ 2 :   T    C    A    G   :   T    C    A    G   :  T   C   A   G  :

  T      T   :  TTT  TCT  TAT  TGT  :  PHE  SER  TYR  CYS  :  F   S   Y   C  :
         C   :  TTC  TCC  TAC  TGC  :  PHE  SER  TYR  CYS  :  F   S   Y   C  :
         A   :  TTA  TCA  TAA  TGA  :  LEU  SER  ***  ***  :  L   S   *   *  :
         G   :  TTG  TCG  TAG  TGG  :  LEU  SER  ***  TRP  :  L   S   *   W  :

  C      T   :  CTT  CCT  CAT  CGT  :  LEU  PRO  HIS  ARG  :  L   P   H   R  :
         C   :  CTC  CCC  CAC  CGC  :  LEU  PRO  HIS  ARG  :  L   P   H   R  :
         A   :  CTA  CCA  CAA  CGA  :  LEU  PRO  GLN  ARG  :  L   P   Q   R  :
         G   :  CTG  CCG  CAG  CGG  :  LEU  PRO  GLN  ARG  :  L   P   Q   R  :

  A      T   :  ATT  ACT  AAT  AGT  :  ILE  THR  ASN  SER  :  I   T   N   S  :
         C   :  ATC  ACC  AAC  AGC  :  ILE  THR  ASN  SER  :  I   T   N   S  :
         A   :  ATA  ACA  AAA  AGA  :  ILE  THR  LYS  ARG  :  I   T   K   R  :
         G   :  ATG  ACG  AAG  AGG  :  MET  THR  LYS  ARG  :  M   T   K   R  :

  G      T   :  GTT  GCT  GAT  GGT  :  VAL  ALA  ASP  GLY  :  V   A   D   G  :
         C   :  GTC  GCC  GAC  GGC  :  VAL  ALA  ASP  GLY  :  V   A   D   G  :
         A   :  GTA  GCA  GAA  GGA  :  VAL  ALA  GLU  GLY  :  V   A   E   G  :
         G   :  GTG  GCG  GAG  GGG  :  VAL  ALA  GLU  GLY  :  V   A   E   G  :

  3 Letter    1 Letter    CODONS

  ALA         A           GCT  GCC  GCA  GCG
  ARG         R           CGT  CGC  CGA  CGG
  ASN         N           AAT  AAC
  ASP         D           GAT  GAC
  CYS         C           TGT  TGC
  GLN         Q           CAA  CAG
  GLU         E           GAA  GAG
  GLY         G           GGT  GGC  GGA  GGG
  HIS         H           CAT  CAC
  ILE         I           ATT  ATC  ATA
  LEU         L           CTT  CTC  CTA  CTG
  LYS         K           AAA  AAG
  MET         M           ATG
  PHE         F           TTT  TTC
  PRO         P           CCT  CCC  CCA  CCG
  SER         S           TCT  TCC  TCA  TCG
  THR         T           ACT  ACC  ACA  ACG
  TRP         W           TGG
  TYR         Y           TAT  TAC
  VAL         V           GTT  GTC  GTA  GTG
  ***         *           TAA  TAG  TGA
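
The codon correspondence tabulated above can also be expressed as a lookup table. The following Python sketch builds the 64-entry table in the same T, C, A, G order and translates a DNA string into one-letter and three-letter symbols. The names CODON_TABLE, THREE_LETTER, translate and to_three_letter are hypothetical and are not part of the original disclosure; the example input is the final nucleotide line of the env listing as printed in this document.

```python
from itertools import product

BASES = "TCAG"
# One-letter symbols for all 64 codons, with the first base varying slowest
# and the third base varying fastest, each in T, C, A, G order.
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"   # first base T
    "LLLLPPPPHHQQRRRR"   # first base C
    "IIIMTTTTNNKKSSRR"   # first base A
    "VVVVAAAADDEEGGGG"   # first base G
)

CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

THREE_LETTER = {
    "A": "Ala", "R": "Arg", "N": "Asn", "D": "Asp", "C": "Cys",
    "Q": "Gln", "E": "Glu", "G": "Gly", "H": "His", "I": "Ile",
    "L": "Leu", "K": "Lys", "M": "Met", "F": "Phe", "P": "Pro",
    "S": "Ser", "T": "Thr", "W": "Trp", "Y": "Tyr", "V": "Val",
    "*": "***",
}


def translate(dna):
    """Translate a DNA string, read in frame from its first base."""
    dna = dna.upper()
    if len(dna) % 3 != 0:
        raise ValueError("sequence length must be a multiple of three")
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna), 3))


def to_three_letter(protein):
    """Convert one-letter symbols to run-together three-letter symbols."""
    return "".join(THREE_LETTER[aa] for aa in protein)


protein = translate("CAGGCAACAAAATATGGA")
print(protein, to_three_letter(protein))
```

A lookup of this kind is one way to cross-check a printed amino acid line against the nucleotide line beneath it; any character outside T, C, A and G raises a KeyError rather than being guessed.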
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